High level promoters from cyanobacteria

ABSTRACT

The invention relates to the field of microbiology. More specifically, methods are provided for the identification of highly expressed genes and their corresponding promoters and UV responsive genes and their corresponding promoters in cyanobacteria Synechocystis sp. PCC6803. These genes and promoters can be used to construct expression vectors in cyanobacteria, green algae or plants, for the production of biomaterials from sunlight, a renewable energy resource.

[0001] This application claims the benefit of U.S. Provisional Application No. 60/264,925, filed Jan. 30, 2001.

FIELD OF THE INVENTION

[0002] The invention relates to the field of microbiology. More specifically, the invention relates to high-level expression promoters and UV responsive promoters in cyanobacteria Synechocystis sp. PCC6803.

BACKGROUND OF THE INVENTION

[0003] The UV-B (290-320 nm) component of sunlight causes significant damage to biological systems ranging from bacteria to plants and humans. The main targets of UV-B irradiation are transfer RNA (tRNA), proteins, lipids, and, in particular, photosystems of photosynthetic organisms including plants, algae and cyanobacteria (Garcia-Pichel, Origins of Life and Evolution of the Biosphere 1998, 28:321-47). Photosynthetic organisms have evolved many different mechanisms to combat the damaging effects of UV-B irradiation, such as reducing photosynthesis and synthesizing UV protective molecules (Ehling-Schultz and Scherer, 1999. Eur. J. Phycol., 34:329-338). The latter may be of interest for use in protection of materials easily damaged by sunlight, or for developing sunscreens.

[0004] The mechanism by which photosynthetic organisms adapt to UV-B light is not completely understood. While several studies have examined the effect of UV and white light on cyanobacteria (Mate et al., J. Biol. Chem. 1998, 273 (28), 17439-17444; Li and Golden, Proc. Natl. Acad. Sci. USA, 1993, 90, 11678-11682; Ehling-Schultz and Scherer, Eur. J. Phycol. 1999, 34, 329-338; Gotz et al., Plant Physiol. 1999, 120 (2) 599-604; Sah et al., Biochem. Mol. Biol. Int. 1998, 44 (2) 245-57; Miroshnichenko Dolganov et al., Proc. Natl. Acad. Sci. USA, 1995, 92:636-640; and Mohamed and Jansson, Plant Mol Biol., 1989, 13:693-700), these authors focused on either the response of single genes or proteins to UV or white light, or certain specific molecules involved in photoprotection. None of these previous studies analyzed a near complete set of the open reading frames in Synechocystis for promoter strength and induction or repression by UV-B light in the 290-320 nm range. The identification of UV-B inducible genes and their promoters would be desirable for identifying UV-B protective compounds as well as for methods of regulating gene expression in cyanobacteria, green algae or plants, for the production of biomaterials from sunlight, a renewable energy resource.

[0005] The problem to be solved, therefore, is to identify highly expressed genes and their corresponding strong promoters, and preferably UV-B inducible genes and their corresponding promoters.

[0006] Applicants have solved this problem by characterizing the global response and adaptation mechanism of the cyanobacterium Synechocystis sp. PCC6803 to the stress of UV-B light using a novel DNA microarray that comprises a near complete set of open reading frames from this species. Therefore, Applicants' invention provides a group of highly expressed genes, as well as a group of UV-B inducible genes, in the cyanobacterium Synechocystis sp. PCC6803 and a collection of useful strong promoters that can be used for gene over-expression either in minimal media, or in response to treatment with UV-B light. The present invention provides a unique approach for controlled overexpression of foreign genes in Synechocystis sp. PCC6803, as well as other cyanobacteria such as Synechococcus and like organisms.

SUMMARY OF THE INVENTION

[0007] The present invention provides two sets of high level expression (i.e., strong) promoters from the cyanobacterium Synechocystis sp. PCC6803. These promoters can be employed for engineering gene expression in Synechocystis sp. PCC6803 and constructing expression vectors for use in Synechocystis as well as other cyanobacteria, such as Synechococcus and like organisms. The first set comprises promoters that demonstrate high level expression in log phase growth. The second set comprises promoters that are induced by exposure to UV-B light.

[0008] The invention therefore provides a method for regulating expression of a coding region of interest in a cyanobacterium comprising:

[0009] a) providing a transformed cyanobacterium having a gene fusion comprising:

[0010] i) a promoter region from a gene selected from the group consisting of:

[0011] 1) an amiC gene or an rbcX gene; and

[0012] 2) a gene having a nucleotide sequence as set forth in SEQ ID NO: 5; and

[0013] ii) a coding region of interest;

[0014] wherein the promoter region is operably linked to the coding region of interest; and

[0015] b) culturing the transformed cyanobacterium of step (a), in the log phase whereby the promoter region is activated and the coding region of interest is expressed.

[0016] Additionally, the invention provides a method for regulating expression of a coding region of interest in a cyanobacterium comprising:

[0017] a) providing a transformed cyanobacterium having a gene fusion comprising:

[0018] i) a promoter region from a gene selected from the group consisting of:

[0019] 1) an hliB gene, an hsp17 gene, an nblB gene, an rpoD gene, an hliA gene, an ftsH gene and a clpB gene; and

[0020] 2) a gene having a nucleotide sequence selected from the group consisting of SEQ ID NOs:9, 11, 17, 21, 25, 27, 31, and 39; and

[0021] ii) a coding region of interest;

[0022] wherein the promoter region is operably linked to the coding region of interest; and

[0023] b) culturing the transformed cyanobacterium of step (a) in the presence of UV-B light, whereby the promoter region is activated and the coding region of interest is expressed.

[0024] Specific cyanobacteria useful in the present invention will be selected from the group consisting of Synechocystis and Synechococcus.

[0025] Specific coding regions of interest useful in the present invention will be selected from the group consisting of crtE, crtB, pds, crtD, crtL, crtZ, crtX, crtO, phaC, phaE, efe, pdc, adh, and genes encoding limonene synthase, pinene synthase, bornyl synthase, phellandrene synthase, cineole synthase, sabinene synthase, and taxadiene synthase.

BRIEF DESCRIPTION OF THE SEQUENCES

[0026] The invention can be more fully understood from the following detailed description and the accompanying sequence descriptions which form a part of this application.

[0027] Sequences contained herein are in conformity with 37 C.F.R. 1.821-1.825 (“Requirements for Patent Applications Containing Nucleotide Sequences and/or Amino Acid Sequence Disclosures - the Sequence Rules”) and consistent with World Intellectual Property Organization (WIPO) Standard ST.25 (1998) and the sequence listing requirements of the EPO and PCT (Rules 5.2 and 49.5(a-bis), and Section 208 and Annex C of the Administrative Instructions). The symbols and format used for nucleotide and amino acid sequence data comply with the rules set forth in 37 C.F.R. §1.822.

Description                                                             Clone Name  Nucleic acid SEQ ID   Peptide SEQ ID
Nucleotide sequence of an amiC gene                                     slr0447     1                     2
Nucleotide sequence of an rbcX gene                                     slr0011     3                     4
Nucleotide sequence of a gene of unknown function induced in log phase  sll1786     5                     6
Nucleotide sequence of an hliB gene                                     ssr2595     7                     8
Nucleotide sequence of a gene of unknown function induced by UV-B       slr1544     9                     10
Nucleotide sequence of a gene of unknown function induced by UV-B       ss0528      11                    12
Nucleotide sequence of an hsp17 gene                                    ssl1514     13                    14
Nucleotide sequence of an nblB gene                                     slr1687     15                    16
Nucleotide sequence of a gene of unknown function induced by UV-B       sll1483     17                    18
Nucleotide sequence of an rpoD gene                                     sll2012     19                    20
Nucleotide sequence of a gene of unknown function induced by UV-B       ssl1633     21                    22
Nucleotide sequence of an hliA gene                                     ssl2542     23                    24
Nucleotide sequence of a gene of unknown function induced by UV-B       sll0846     25                    26
Nucleotide sequence of a gene of unknown function                       slr1674     27                    28
Nucleotide sequence of an ftsH gene                                     slr1604     29                    30
Nucleotide sequence of a gene of unknown function induced by UV-B       slr0320     31                    32
Nucleotide sequence of an rpoD gene                                     sll0306     33                    34
Nucleotide sequence of an ftsH gene                                     slr0228     35                    36
Nucleotide sequence of a clpB gene                                      slr1641     37                    38
Nucleotide sequence of a gene of unknown function induced by UV-B       ssr2016     39                    40

DETAILED DESCRIPTION OF THE INVENTION

[0028] Applicants have used a novel DNA microarray to identify the global response and adaptation of the cyanobacterium Synechocystis sp. PCC6803 to UV-B light and to identify strong promoters for construction of gene expression vectors in Synechocystis sp. PCC6803. Specifically, Applicants have identified genes which are highly expressed in log phase growth and genes whose expression is highly induced by UV-B light.

[0029] Applicants have identified genes and promoters that can be used to express coding regions of interest in cyanobacteria.

[0030] In this disclosure, a number of terms and abbreviations are used. The following definitions are provided and should be helpful in understanding the scope and practice of the present invention.

[0031] A “nucleic acid” is a polymeric compound comprised of covalently linked subunits called nucleotides. Nucleic acid includes polyribonucleic acid (RNA) and polydeoxyribonucleic acid (DNA), both of which may be single-stranded or double-stranded. DNA includes cDNA, genomic DNA, synthetic DNA, and semi-synthetic DNA.

[0032] A “nucleic acid molecule” refers to the phosphate ester polymeric form of ribonucleosides (adenosine, guanosine, uridine or cytidine; “RNA molecules”) or deoxyribonucleosides (deoxyadenosine, deoxyguanosine, deoxythymidine, or deoxycytidine; “DNA molecules”), or any phosphoester analogs thereof, such as phosphorothioates and thioesters, in either single stranded form, or a double-stranded helix. Double stranded DNA-DNA, DNA-RNA and RNA-RNA helices are possible. The term nucleic acid molecule, and in particular DNA or RNA molecule, refers only to the primary and secondary structure of the molecule, and does not limit it to any particular tertiary forms. Thus, this term includes double-stranded DNA found, inter alia, in linear or circular DNA molecules (e.g., restriction fragments), plasmids, and chromosomes. In discussing the structure of particular double-stranded DNA molecules, sequences may be described herein according to the normal convention of giving only the sequence in the 5′ to 3′ direction along the non-transcribed strand of DNA (i.e., the strand having a sequence homologous to the mRNA). A “recombinant DNA molecule” is a DNA molecule that has undergone a molecular biological manipulation.

[0033] As used herein, an “isolated nucleic acid fragment” is a polymer of RNA or DNA that is single- or double-stranded, optionally containing synthetic, non-natural or altered nucleotide bases. An isolated nucleic acid fragment in the form of a polymer of DNA may be comprised of one or more segments of cDNA, genomic DNA or synthetic DNA.

[0034] A “gene” refers to an assembly of nucleotides that encodes a polypeptide, and includes cDNA and genomic DNA nucleic acids. “Gene” also refers to a nucleic acid fragment that expresses a specific protein, including regulatory sequences preceding (5′ non-coding sequences) and following (3′ non-coding sequences) the coding sequence. “Native gene” refers to a gene as found in nature with its own regulatory sequences. “Chimeric gene” refers to any gene that is not a native gene, comprising regulatory and coding sequences that are not found together in nature. Accordingly, a chimeric gene may comprise regulatory sequences and coding sequences that are derived from different sources, or regulatory sequences and coding sequences derived from the same source, but arranged in a manner different than that found in nature. “Endogenous gene” refers to a native gene in its natural location in the genome of an organism. A “foreign” or “heterologous” gene refers to a gene not normally found in the host organism, but that is introduced into the host organism by gene transfer. Foreign genes can comprise native genes inserted into a non-native organism, or chimeric genes. A “transgene” is a gene that has been introduced into the genome by a transformation procedure.

[0035] The terms “3′ non-coding sequences” or “3′ un-translated region (UTR)” refer to DNA sequences located downstream (3′) of a coding sequence and may comprise polyadenylation recognition sequences and other sequences encoding regulatory signals capable of affecting mRNA processing or gene expression. The polyadenylation signal is usually characterized by affecting the addition of polyadenylic acid tracts to the 3′ end of the mRNA precursor.

[0036] “RNA transcript” refers to the product resulting from RNA polymerase-catalyzed transcription of a DNA sequence. When the RNA transcript is a perfect complementary copy of the DNA sequence, it is referred to as the primary transcript; when it is an RNA sequence derived from post-transcriptional processing of the primary transcript, it is referred to as the mature RNA. “Messenger RNA (mRNA)” refers to the RNA that is without introns and that can be translated into protein by the cell.

[0037] As used herein, the term “homologous” in all its grammatical forms and spelling variations refers to the relationship between proteins that possess a “common evolutionary origin”, including proteins from superfamilies and homologous proteins from different species (Reeck et al., 1987, Cell 50:667). Such proteins (and their encoding genes) have sequence homology, as reflected by their high degree of sequence similarity.

[0038] The term “homologue”, when referring to a gene, means a gene of similar function in the same or different species which may have a high degree of nucleic acid or amino acid relatedness.

[0039] The term “corresponding to” is used herein to refer to similar or homologous sequences, whether the exact position is identical or different from the molecule to which the similarity or homology is measured. A nucleic acid or amino acid sequence alignment may include spaces. Thus, the term “corresponding to” refers to the sequence similarity, and not the numbering of the amino acid residues or nucleotide bases.

[0040] “Promoter” refers to a DNA sequence capable of controlling the expression of a coding sequence or functional RNA. In general, a coding sequence is located 3′ to a promoter sequence. Promoters may be derived in their entirety from a native gene, or be composed of different elements derived from different promoters found in nature, or even comprise synthetic DNA segments. It is understood by those skilled in the art that different promoters may direct the expression of a gene in different tissues or cell types, or at different stages of development, or in response to different environmental or physiological conditions. Promoters which cause a gene to be expressed in most cell types at most times are commonly referred to as “constitutive promoters”. It is further recognized that since in most cases the exact boundaries of regulatory sequences have not been completely defined, DNA fragments of different lengths may have identical promoter activity.

[0041] “Regulatory region” means a nucleic acid sequence which regulates the expression of a second nucleic acid sequence. A regulatory region may include sequences which are naturally responsible for expressing a particular nucleic acid (a homologous region) or may include sequences of a different origin which are responsible for expressing different proteins or even synthetic proteins (a heterologous region). In particular, the sequences can be sequences of prokaryotic, eukaryotic, or viral genes or derived sequences which stimulate or repress transcription of a gene in a specific or non-specific manner and in an inducible or non-inducible manner. Regulatory regions include origins of replication, RNA splice sites, promoters, enhancers, transcriptional termination sequences, and signal sequences which direct the polypeptide into the secretory pathways of the target cell. A regulatory region from a “heterologous source” is a regulatory region which is not naturally associated with the expressed nucleic acid. Included among the heterologous regulatory regions are regulatory regions from a different species, regulatory regions from a different gene, hybrid regulatory sequences, and regulatory sequences which do not occur in nature, but which are designed by one having ordinary skill in the art. An “inducible promoter” refers to those regulated promoters that can be turned on in one or more cell types by an external stimulus or stress, such as a chemical, or light.

[0042] “Coding sequence”, “coding region” or “open reading frame” (ORF) refers to a DNA sequence that codes for a specific amino acid sequence. A coding sequence is “under the control” of transcriptional and translational control sequences in a cell when RNA polymerase transcribes the coding sequence into mRNA, which is then trans-RNA spliced (if the coding sequence contains introns) and translated into the protein encoded by the coding sequence. The term “coding region of interest” refers to a coding region expressible in a cyanobacterial host.

[0043] The term “operably linked” refers to the association of nucleic acid sequences on a single nucleic acid fragment so that the function of one is affected by the other. For example, a promoter is operably linked with a coding sequence when it is capable of affecting the expression of that coding sequence (i.e., that the coding sequence is under the transcriptional control of the promoter). Coding sequences can be operably linked to regulatory sequences in sense or antisense orientation.

[0044] The term “gene fusion” refers to the operable linking of at least two functional nucleic acid fragments which are not normally so linked in nature. Gene fusions are often comprised of promoter or regulatory regions operably linked to coding regions of other genes. Gene fusions of the present invention will typically comprise an inducible promoter operably linked to a coding region of interest.

[0045] A “polypeptide” is a polymeric compound comprised of covalently linked amino acid residues. Amino acids have the following general structure: H2N-CH(R)-COOH, where R denotes the side chain.

[0046] Amino acids are classified into seven groups on the basis of the side chain R: (1) aliphatic side chains, (2) side chains containing a hydroxy (OH) group, (3) side chains containing sulfur atoms, (4) side chains containing an acidic or amide group, (5) side chains containing a basic group, (6) side chains containing an aromatic ring, and (7) proline, an imino acid in which the side chain is fused to the amino group. A polypeptide of the invention preferably comprises at least about 14 amino acids.

[0047] A “heterologous protein” refers to a protein not naturally produced in the cell.

[0048] A nucleic acid molecule is “hybridizable” to another nucleic acid molecule, such as a cDNA, genomic DNA, or RNA, when a single stranded form of the nucleic acid molecule can anneal to the other nucleic acid molecule under the appropriate conditions of temperature and solution ionic strength. Hybridization and washing conditions are well known and exemplified in Sambrook, J., Fritsch, E. F. and Maniatis, T. Molecular Cloning: A Laboratory Manual, Second Edition, Cold Spring Harbor Laboratory Press, Cold Spring Harbor (1989), particularly Chapter 11 and Table 11.1 therein. The conditions of temperature and ionic strength determine the “stringency” of the hybridization. Hybridization requires that the two nucleic acids contain complementary sequences, although depending on the stringency of the hybridization, mismatches between bases are possible. The appropriate stringency for hybridizing nucleic acids depends on the length of the nucleic acids and the degree of complementation, variables well known in the art. The greater the degree of similarity or homology between two nucleotide sequences, the greater the value of Tm for hybrids of nucleic acids having those sequences. The relative stability (corresponding to higher Tm) of nucleic acid hybridizations decreases in the following order: RNA:RNA, DNA:RNA, DNA:DNA. For hybrids of greater than 100 nucleotides in length, equations for calculating Tm have been derived (see Sambrook et al., supra, 9.50-9.51). For hybridizations with shorter nucleic acids, i.e., oligonucleotides, the position of mismatches becomes more important, and the length of the oligonucleotide determines its specificity (see Sambrook et al., supra, 11.7-11.8). Furthermore, the skilled artisan will recognize that the temperature and wash solution salt concentration may be adjusted as necessary according to factors such as length of the probe.
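By way of illustration only, the dependence of Tm on probe length, base composition and ionic strength described above can be sketched in Python. The sketch assumes the widely quoted empirical relation for DNA:DNA hybrids longer than 100 nucleotides, Tm = 81.5 + 16.6 log10[Na+] + 0.41 (%G+C) - 0.63 (% formamide) - 600/L; whether this is the exact form given at the cited pages of Sambrook et al. is an assumption, and the function name and example inputs are hypothetical.

```python
import math


def estimate_tm_celsius(length_nt, gc_percent, na_molar, formamide_percent=0.0):
    """Estimate the Tm (deg C) of a long DNA:DNA hybrid.

    Assumed empirical relation (not taken from this disclosure):
    Tm = 81.5 + 16.6*log10([Na+]) + 0.41*(%G+C) - 0.63*(%formamide) - 600/L
    """
    if length_nt <= 100:
        # The text reserves such equations for hybrids longer than 100 nt.
        raise ValueError("relation is intended for hybrids longer than 100 nucleotides")
    if na_molar <= 0:
        raise ValueError("monovalent cation concentration must be positive")
    return (
        81.5
        + 16.6 * math.log10(na_molar)
        + 0.41 * gc_percent
        - 0.63 * formamide_percent
        - 600.0 / length_nt
    )


# Hypothetical example: a 600 nt probe, 50% G+C, 1 M Na+, no formamide.
example_tm = estimate_tm_celsius(600, 50.0, 1.0)
```

Under this relation, lowering the salt concentration or adding formamide lowers the estimated Tm, which is consistent with the statement that wash temperature and salt concentration may be adjusted to set the stringency.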

[0049] The term “complementary” is used to describe the relationship between nucleotide bases that are capable of hybridizing to one another. For example, with respect to DNA, adenosine is complementary to thymine and cytosine is complementary to guanine. Accordingly, the instant invention also includes isolated nucleic acid fragments that are complementary to the complete sequences as reported in the accompanying Sequence Listing as well as those substantially similar nucleic acid sequences.
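As a minimal sketch of the base-pairing rule stated above (A with T, C with G), the following Python fragment derives the complementary strand of a DNA sequence and tests whether two strands are fully complementary. The function names are hypothetical, and the sketch covers only unambiguous DNA bases; it does not model partial complementarity or mismatches.

```python
# Translation table implementing the A<->T and C<->G pairing rule for DNA.
_DNA_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the complementary strand, written 5' to 3'."""
    invalid = set(seq) - set("ACGTacgt")
    if invalid:
        raise ValueError(f"unsupported characters: {sorted(invalid)}")
    # Complement each base, then reverse so the result also reads 5' to 3'.
    return seq.translate(_DNA_COMPLEMENT)[::-1]


def are_complementary(strand_a, strand_b):
    """True when strand_a pairs base-for-base with strand_b (both 5' to 3')."""
    return strand_a.upper() == reverse_complement(strand_b).upper()
```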

[0050] The term “probe” refers to a single-stranded nucleic acid molecule that can base pair with a complementary single stranded target nucleic acid to form a double-stranded molecule.

[0051] As used herein, the term “oligonucleotide” refers to a nucleic acid, generally of at least 18 nucleotides, that is hybridizable to a genomic DNA molecule, a cDNA molecule, or an mRNA molecule. Oligonucleotides can be labeled, e.g., with ³²P-nucleotides or nucleotides to which a label, such as biotin, has been covalently conjugated. In one embodiment, a labeled oligonucleotide can be used as a probe to detect the presence of a nucleic acid according to the invention. In another embodiment, oligonucleotides (one or both of which may be labeled) can be used as PCR primers, either for cloning full length or a fragment of a nucleic acid of the invention, or to detect the presence of nucleic acids according to the invention. In a further embodiment, an oligonucleotide of the invention can form a triple helix with a DNA molecule. Generally, oligonucleotides are prepared synthetically, preferably on a nucleic acid synthesizer. Accordingly, oligonucleotides can be prepared with non-naturally occurring phosphoester analog bonds, such as thioester bonds, etc.

[0052] The term “expression”, as used herein, refers to the transcription and stable accumulation of sense (mRNA) or antisense RNA derived from the nucleic acid fragment of the invention. Expression may also refer to translation of mRNA into a polypeptide.

[0053] The term “DNA microarray” or “DNA chip” means an assembly of PCR products of a group of genes, or of all genes within a genome, on a solid surface in a high density format or array. General methods for array construction and use are available (see Schena M, Shalon D, Davis R W, Brown P O., Quantitative monitoring of gene expression patterns with a complementary DNA microarray. Science. Oct. 20, 1995; 270(5235): 467-70). A DNA microarray allows the gene expression patterns or profiles of many genes to be analyzed simultaneously by hybridizing the DNA microarray comprising these genes or PCR products of these genes with cDNA probes prepared from the sample to be analyzed. DNA microarray or “chip” technology permits examination of gene expression on a genomic scale, allowing transcription levels of many genes to be measured simultaneously. Briefly, DNA microarray or chip technology comprises arraying microscopic amounts of DNA complementary to genes of interest or open reading frames on a solid surface at defined positions. This solid surface is generally a glass slide, or a membrane (such as nylon membrane). The DNA sequences may be arrayed by spotting or by photolithography. Two separate fluorescently-labeled probe mixes prepared from the two samples to be compared are hybridized to the microarray, and the presence and amount of the bound probes are detected by fluorescence following laser excitation using a scanning confocal microscope and quantitated using a laser scanner and appropriate array analysis software packages. Cy3 (green) and Cy5 (red) fluorescent labels are routinely used in the art; however, other similar fluorescent labels may also be employed. To obtain and quantitate a gene expression profile or pattern between the two compared samples, the ratio between the signals in the two channels (red:green) is calculated, with the relative intensity of Cy5/Cy3 probes taken as a reliable measure of the relative abundance of specific mRNAs in each sample. Materials for the construction of DNA microarrays are commercially available (Affymetrix (Santa Clara, Calif.), Sigma Chemical Company (St. Louis, Mo.), Genosys (The Woodlands, Tex.), Clontech (Palo Alto, Calif.), and Corning (Corning, N.Y.)). In addition, custom DNA microarrays can be prepared by commercial vendors such as Affymetrix, Clontech, and Corning.
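The two-channel ratio described above may be sketched as follows. This is an illustrative Python fragment only: the function names, the intensity floor of 1.0 and the two-fold cutoff are assumptions introduced for the example and are not values taken from this disclosure, and background subtraction and channel normalization are presumed to have been performed beforehand.

```python
import math


def log2_ratio(cy5_signal, cy3_signal, floor=1.0):
    """Log2 of the red:green (Cy5/Cy3) ratio for one array spot.

    `floor` is an assumed lower bound that avoids division by zero for
    spots whose background-corrected signal is zero or negative.
    """
    cy5 = max(cy5_signal, floor)
    cy3 = max(cy3_signal, floor)
    return math.log2(cy5 / cy3)


def classify_spot(cy5_signal, cy3_signal, fold_cutoff=2.0):
    """Label a spot by relative mRNA abundance in the two samples.

    `fold_cutoff` is a hypothetical threshold chosen for illustration.
    """
    ratio = log2_ratio(cy5_signal, cy3_signal)
    cutoff = math.log2(fold_cutoff)
    if ratio >= cutoff:
        return "higher in Cy5 sample"
    if ratio <= -cutoff:
        return "higher in Cy3 sample"
    return "unchanged"
```

Working on a log2 scale makes equal fold increases and decreases symmetric about zero, which is one common reason for reporting two-channel data this way.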

[0054] The term “expression profile” refers to the expression of groups of genes.

[0055] The term “gene expression profile” refers to the expression of an individual gene and of suites of individual genes.

[0056] The “comprehensive expression profile” refers to the gene expression profile of more than 75% of all genes in the genome.
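For illustration, the threshold in the preceding definition can be expressed as a simple check. The Python sketch below is hypothetical (the function name and the example counts are not from this disclosure) and treats “more than 75%” as a strict inequality.

```python
def is_comprehensive(genes_measured, genes_in_genome):
    """True when the profile covers more than 75% of the genes in the genome."""
    if genes_in_genome <= 0:
        raise ValueError("genome must contain at least one gene")
    if not 0 <= genes_measured <= genes_in_genome:
        raise ValueError("measured gene count must lie between 0 and the genome total")
    # Integer form of genes_measured / genes_in_genome > 0.75,
    # which avoids floating-point rounding at the boundary.
    return 4 * genes_measured > 3 * genes_in_genome


# Hypothetical example: 3,100 of 4,000 genes measured.
example = is_comprehensive(3100, 4000)
```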

[0057] A “vector” or “plasmid” is any means for the transfer of a nucleic acid into a host cell. A vector may be a replicon to which another DNA segment may be attached so as to bring about the replication of the attached segment. A “replicon” is any genetic element (e.g., plasmid, phage, cosmid, chromosome, virus) that functions as an autonomous unit of DNA replication in vivo, i.e., capable of replication under its own control. In general, a “replicon” is a unit length of DNA that replicates sequentially and which comprises an origin of replication. The term “vector” includes both viral and nonviral means for introducing the nucleic acid into a cell in vitro, ex vivo or in vivo. Viral vectors include retrovirus, adeno-associated virus, pox, baculovirus, vaccinia, herpes simplex, Epstein-Barr and adenovirus vectors. Non-viral vectors include plasmids, liposomes, electrically charged lipids (cytofectins), DNA-protein complexes, and biopolymers. In addition to a nucleic acid, a vector may also contain one or more regulatory regions, and/or selectable markers useful in selecting, measuring, and monitoring nucleic acid transfer results (transfer to which tissues, duration of expression, etc.).

[0058] A “cloning vector” is a replicon, such as plasmid, phage or cosmid, to which another DNA segment may be attached so as to bring about the replication of the attached segment. Cloning vectors may be capable of replication in one cell type, and expression in another (“shuttle vector”).

[0059] A “cassette” refers to a segment of DNA that can be inserted into a vector at specific restriction sites. The segment of DNA encodes a polypeptide of interest, and the cassette and restriction sites are designed to ensure insertion of the cassette in the proper reading frame for transcription and translation. “Transformation cassette” refers to a specific vector containing a foreign gene and having elements in addition to the foreign gene that facilitate transformation of a particular host cell. “Expression cassette” refers to a specific vector containing a foreign gene and having elements in addition to the foreign gene that allow for enhanced expression of that gene in a foreign host.

[0060] A cell has been “transfected” by exogenous or heterologous DNA when such DNA has been introduced inside the cell. A cell has been “transformed” by exogenous or heterologous DNA when the transfected DNA effects a phenotypic change. The transforming DNA can be integrated (covalently linked) into chromosomal DNA making up the genome of the cell.

[0061] “Transformation” refers to the transfer of a nucleic acid fragment into the genome of a host organism, resulting in genetically stable inheritance. Host organisms containing the transformed nucleic acid fragments are referred to as “transgenic” or “recombinant” or “transformed” organisms.

[0062] The terms “stress”, “environmental stress”, “insult” or “environmental insult” refer to any substance or environmental change that results in an alteration of normal cellular metabolism in a bacterial cell or population of cells. Environmental insults may include, but are not limited to, chemicals, environmental pollutants, heavy metals, changes in temperature, changes in pH, as well as agents producing oxidative damage, DNA damage, anaerobiosis, and changes in nitrate availability or pathogenesis.

[0063] The terms “log phase”, “log phase growth”, “exponential phase” or “exponential phase growth” refer to cell cultures of organisms growing under conditions permitting the exponential multiplication of the cell number.

[0064] The term “UV-B light” means light at a wavelength of about 290 nm to about 320 nm.

[0065] The terms “UV-B light treatment”, “UV-B treatment”, “UV-B irradiation” or “UV-B exposure” mean UV-B light that is administered at an intensity of about 20 μE s⁻¹ m⁻² to about 80 μE s⁻¹ m⁻². Preferably, the UV-B light is administered at an intensity of about 20 μE s⁻¹ m⁻².

[0066] The terms “UV-inducible” or “UV-B-inducible” gene or promoter refer to a gene or promoter whose expression or induction increases upon exposure to UV-B light.

[0067] In a specific embodiment, the term “about” or “approximately” means within 20%, preferably within 10%, and more preferably within 5% of a given value or range.
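As a worked illustration of this definition, the Python sketch below converts a nominal value into the interval implied by “about” at each of the three stated tolerances. The function name is hypothetical; the example applies it to the nominal intensity of 20 recited in paragraph [0065], in the units used there.

```python
def about_range(value, tolerance=0.20):
    """Return the (low, high) interval implied by "about" for a given tolerance.

    tolerance is a fraction: 0.20 (within 20%), 0.10 (preferred)
    or 0.05 (more preferred), per the definition above.
    """
    if not 0 <= tolerance < 1:
        raise ValueError("tolerance must be a fraction in [0, 1)")
    delta = abs(value) * tolerance
    return (value - delta, value + delta)


# Intervals for a nominal value of 20 at each stated tolerance.
ranges = {tol: about_range(20.0, tol) for tol in (0.20, 0.10, 0.05)}
```

At the broadest tolerance, a nominal value of 20 therefore spans roughly 16 to 24, and the preferred tolerances narrow that interval accordingly.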

[0068] Standard recombinant DNA and molecular cloning techniques used here are well known in the art and are described by Sambrook, J., Fritsch, E. F. and Maniatis, T., Molecular Cloning: A Laboratory Manual, Second Edition, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y. (1989) (hereinafter “Maniatis”); by Silhavy, T. J., Berman, M. L. and Enquist, L. W., Experiments with Gene Fusions, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y. (1984); and by Ausubel, F. M. et al., Current Protocols in Molecular Biology, published by Greene Publishing Assoc. and Wiley-Interscience (1987).

[0069] DNA Microarray Analysis

[0070] The present invention provides methods for gene expression and regulation in cyanobacteria using the promoter regions from genes that are either highly expressed in log phase growth or under the influence of UV-B light. The present promoters were identified using DNA microarray technology.

[0071] It will appreciated that in order to measure the transcription level (and thereby the expression level) of a gene or genes, it is desirable to provide a nucleic acid sample comprising mRNA transcript(s) of the gene or genes, or nucleic acids derived from the mRNA transcript(s). As used herein, a nucleic acid derived from an mRNA transcript refers to a nucleic acid for whose synthesis the mRNA transcript or a subsequence thereof has ultimately served as a template. Thus, a cDNA reverse transcribed from an mRNA, an RNA transcribed from that cDNA, a DNA amplified from the cDNA, an RNA transcribed from the amplified DNA, etc., are all derived from the mRNA transcript and detection of such derived products is indicative of the presence and/or abundance of the original transcript in a sample. Thus, suitable samples include, but are not limited to, mRNA transcripts of the gene or genes, cDNA reverse transcribed from the mRNA, cRNA transcribed from the cDNA, DNA amplified from the genes, RNA transcribed from amplified DNA, and the like.

[0072] Typically the genes are amplified by methods of primer directed amplification such as polymerase chain reaction (PCR) (U.S. Pat. No. 4,683,202 (1987, Mullis, et al.) and U.S. Pat. No. 4,683,195 (1986, Mullis, et al.)), ligase chain reaction (LCR) (Tabor et al., Proc. Acad. Sci. U.S.A., 82, 1074-1078 (1985)) or strand displacement amplification (Walker et al., Proc. Natl. Acad. Sci. U.S.A., 89, 392 (1992)), for example.

[0073] The micro-array is comprehensive in that it incorporates at least 75% of all ORF's present in the genome. Amplified ORF's are then spotted on slides comprised of glass or some other solid substrate by methods well known in the art to form a micro-array. Methods of forming high density arrays of oligonucleotides, with a minimal number of synthetic steps are known (see for example Brown et al., U.S. Pat. No. 6,110,426). The oligonucleotide analogue array can be synthesized on a solid substrate by a variety of methods, including, but not limited to, light-directed chemical coupling, and mechanically directed coupling. See Pirrung et al., U.S. Pat. No. 5,143,854 (see also PCT Application No. WO 90/15070) and Fodor et al., PCT Publication Nos. WO 92/10092 and WO 93/09668 which disclose methods of forming vast arrays of peptides, oligonucleotides and other molecules using, for example, light-directed synthesis techniques. See also, Fodor et al., Science, 251, 767-77 (1991).

[0074] Bacteria typically contain from about 2000 to about 6000 ORF's per genome and the present method is suitable for genomes of this size, where genomes of about 4000 ORF's are most suitable. The ORF's are arrayed in high density on at least one glass microscope slide. This is in contrast to a low density array where ORF's are arrayed on a membranous material such as nitrocellulose. The small surface area of the high density array (often less than about 10 cm², preferably less than about 5 cm², more preferably less than about 2 cm², and most preferably less than about 1.6 cm²) permits extremely uniform hybridization conditions (temperature regulation, salt content, etc.).

[0075] Once all the genes or ORF's from the genome are amplified, isolated and arrayed, a set of probes bearing a signal-generating label is synthesized. Probes may be randomly generated or may be synthesized based on the sequence of specific open reading frames. Probes of the present invention are typically single stranded nucleic acid sequences which are complementary to the nucleic acid sequences to be detected. Probes are “hybridizable” to the ORF's. The probe length can vary from 5 bases to tens of thousands of bases, and will depend upon the specific test to be done. Typically a probe length of about 15 bases to about 30 bases is suitable. Only part of the probe molecule need be complementary to the nucleic acid sequence to be detected. In addition, the complementarity between the probe and the target sequence need not be perfect. Hybridization does occur between imperfectly complementary molecules with the result that a certain fraction of the bases in the hybridized region are not paired with the proper complementary base.
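The notion of imperfect complementarity described above can be illustrated with a short sketch. The following Python fragment is only an illustration under stated assumptions: the function names and the example sequence are hypothetical and do not come from the specification, and the calculation considers only Watson-Crick pairing over a probe and target site of equal length. It reports the fraction of probe bases that would not pair with the proper complementary base.

```python
# Illustrative sketch only. The names and the example sequence are hypothetical
# and are not taken from the specification.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def mismatch_fraction(probe: str, target_site: str) -> float:
    """Fraction of probe positions not paired with the proper complementary base.

    Assumes the probe anneals antiparallel to a target site of the same length.
    """
    if len(probe) != len(target_site):
        raise ValueError("probe and target site must have the same length")
    expected = reverse_complement(target_site)
    mismatches = sum(p != e for p, e in zip(probe.upper(), expected))
    return mismatches / len(probe)


# A made-up 20-base target site, within the 15 to 30 base range noted above.
target = "ATGCGTACCGGATTCAGCTA"
perfect_probe = reverse_complement(target)

# Introduce a single substitution at position 5 to mimic an imperfect probe.
substitute = "A" if perfect_probe[5] != "A" else "C"
imperfect_probe = perfect_probe[:5] + substitute + perfect_probe[6:]

print(mismatch_fraction(perfect_probe, target))    # fully complementary probe
print(mismatch_fraction(imperfect_probe, target))  # one mismatch in 20 bases
```

Whether a given mismatch fraction still permits hybridization depends on the stringency of the conditions chosen; the sketch does not attempt to model that.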

[0076] Signal-generating labels that may be incorporated into the probes are well known in the art. For example, labels may include but are not limited to fluorescent moieties, chemiluminescent moieties, particles, enzymes, radioactive tags, or light emitting moieties or molecules, where fluorescent moieties are preferred. Most preferred are fluorescent dyes capable of attaching to nucleic acids and emitting a fluorescent signal. A variety of dyes are known in the art such as fluorescein, Texas red, and rhodamine. Preferred in the present invention are the mono reactive dyes Cy3 (146368-16-3) and Cy5 (146368-14-1), both available commercially (e.g., Amersham Pharmacia Biotech, Arlington Heights, Ill.). Suitable dyes are discussed in U.S. Pat. No. 5,814,454, hereby incorporated by reference.

[0077] Labels may be incorporated by any of a number of means well known to those of skill in the art. However, in a preferred embodiment, the label is simultaneously incorporated during the amplification step in the preparation of the probe nucleic acids. Thus, for example, polymerase chain reaction (PCR) with labeled primers or labeled nucleotides will provide a labeled amplification product. In a preferred embodiment, reverse transcription or replication, using a labeled nucleotide (e.g. dye-labeled UTP and/or CTP) incorporates a label into the transcribed nucleic acids.

[0078] Alternatively, a label may be added directly to the original nucleic acid sample (e.g., mRNA, polyA mRNA, cDNA, etc.) or to the amplification product after the synthesis is completed. Means of attaching labels to nucleic acids are well known to those of skill in the art and include, for example nick translation or end-labeling (e.g. with a labeled RNA) by kinasing of the nucleic acid and subsequent attachment (ligation) of a nucleic acid linker joining the sample nucleic acid to a label (e.g., a fluorophore).

[0079] Following incorporation of the label into the probe the probes are then hybridized to the micro-array using standard conditions where hybridization results in a double stranded nucleic acid, generating a detectable signal from the label at the site of capture reagent attachment to the surface. Typically the probe and array must be mixed with each other under conditions which will permit nucleic acid hybridization. This involves contacting the probe and array in the presence of an inorganic or organic salt under the proper concentration and temperature conditions. The probe and array nucleic acids must be in contact for a long enough time that any possible hybridization between the probe and sample nucleic acid may occur. The concentration of probe or array in the mixture will determine the time necessary for hybridization to occur. The higher the probe or array concentration the shorter the hybridization incubation time needed. Optionally a chaotropic agent may be added. The chaotropic agent stabilizes nucleic acids by inhibiting nuclease activity. Furthermore, the chaotropic agent allows sensitive and stringent hybridization of short oligonucleotide probes at room temperature [Van Ness and Chen (1991) Nucl. Acids Res. 19:5143-5151]. Suitable chaotropic agents include guanidinium chloride, guanidinium thiocyanate, sodium thiocyanate, lithium tetrachloroacetate, sodium perchlorate, rubidium tetrachloroacetate, potassium iodide, and cesium trifluoroacetate, among others. Typically, the chaotropic agent will be present at a final concentration of about 3 M. If desired, one can add formamide to the hybridization mixture, typically 30-50% (v/v).

[0080] Various hybridization solutions can be employed. Typically, these comprise from about 20 to 60% volume, preferably 30%, of a polar organic solvent. A common hybridization solution employs about 30-50% v/v formamide, about 0.15 to 1 M sodium chloride, about 0.05 to 0.1 M buffers, such as sodium citrate, Tris-HCl, PIPES or HEPES (pH range about 6-9), about 0.05 to 0.2% detergent, such as sodium dodecylsulfate, or between 0.5-20 mM EDTA, FICOLL (Pharmacia Inc.) (about 300-500 kilodaltons), polyvinylpyrrolidone (about 250-500 kdal), and serum albumin. Also included in the typical hybridization solution will be unlabeled carrier nucleic acids from about 0.1 to 5 mg/mL, fragmented nucleic DNA, e.g., calf thymus or salmon sperm DNA, or yeast RNA, and optionally from about 0.5 to 2% wt./vol. glycine. Other additives may also be included, such as volume exclusion agents which include a variety of polar water-soluble or swellable agents, such as polyethylene glycol, anionic polymers such as polyacrylate or polymethylacrylate, and anionic saccharidic polymers, such as dextran sulfate. Methods of optimizing hybridization conditions are well known to those of skill in the art (see, e.g., Laboratory Techniques in Biochemistry and Molecular Biology, Vol. 24: Hybridization With Nucleic Acid Probes, P. Tijssen, ed. Elsevier, N.Y., (1993)) and Maniatis, supra.

[0081] Gene expression profiling via micro-array technology relies on comparing an organism under a variety of conditions that result in alteration of the genes expressed. Within the context of the present invention a single population of cells was exposed to a variety of stresses that resulted in the alteration of gene expression. Specifically, expression was monitored under the conditions of exposure to UV-B light and log phase growth. Non-stressed cells are used for generation of “control” arrays and stressed cells are used to generate “experimental”, “stressed” or “induced” arrays. Using these methods it was determined that the genes amiC and rbcX, encoding a putative periplasmic binding protein and a putative chaperone respectively, were highly induced in log phase growth. Similarly, under the stress of UV-B light it was determined that the hliB, hsp17, nblB, rpoD, hliA, ftsH, and clpB genes were highly induced.

[0082] Nucleic Acids of the Invention

[0083] Two sets of high level expression (i.e., strong) promoters from cyanobacteria Synechocystis sp. PCC6803 have been identified using the above described DNA microarray technology. One set of promoters was derived from the amiC and rbcX genes, which have been shown to be highly expressed in log phase growth. The second set of promoters is induced by UV-B light and is derived from the genes hliB, hsp17, nblB, rpoD, hliA, ftsH, and clpB.

[0084] The amiC gene has putatively been identified as encoding a periplasmic binding protein based on sequence comparison to similar genes in public databases. amiC has been identified in Pseudomonas as being the controller of transcription antitermination in the amidase operon (Pearl et al., EMBO J. (1994), 13(24), 5810-17) and in Synechocystis (Kaneko et al., Sequence analysis of the genome of the unicellular cyanobacterium Synechocystis sp. strain PCC6803. II. Sequence determination of the entire genome and assignment of potential protein-coding regions, DNA Res. 3 (3), 109-136 (1996)).

[0085] The rbcX gene has been putatively identified as a chaperone based on sequence comparisons to publicly available databases. rbcX has been identified in Synechocystis (Kaneko et al., supra) and in filamentous cyanobacteria of the genus Anabaena (Li et al. J. Bacteriol. (1997), 179(11), 3793-3796) as well as Microcystis, Tychonema, Planktothrix and Nostoc (Rudi et al., J. Bacteriol. (1998), 180(13), 3453-3461) and is thought to encode a protein that facilitates protein folding for ribulose 1,5-bisphosphate carboxylase/oxygenase. In addition to the amiC and rbcX genes, genes of unknown function have been identified as being highly induced in log phase. The most significant gene in this category has nucleic acid and amino acid sequences as set forth in SEQ ID NOs:5 and 6 respectively.

[0086] Although both amiC and rbcX are known, it was not until Applicant's invention that it was appreciated that these genes were induced at high levels in the log phase and offer the promise of high level gene expression for gene fusions in cyanobacteria.

[0087] hliB and hliC have been identified as genes inducible by high light in Synechocystis (Kaneko et al., supra) and homologs have been found in higher plants and red algae (Jansson et al., Plant Molecular Biology, (January, 2000) Vol. 42, No. 2, pp. 345-351). It is thought that the hli gene product may bind chlorophyll and form dimers in the thylakoid membrane of the photosystem II complex.

[0088] hsp17 is well known to be highly expressed in response to heat stress. hsp17 is present in Synechocystis (Kaneko et al., supra) and Synechococcus (Nishiyama et al., Plant Physiology (Rockville), (May, 1999) Vol. 120, No. 1, pp. 301-308). It has been suggested that in the cyanobacteria hsp17 may play a role in the thylakoid fluidity levels of the cell membrane (Horvath et al., Proceedings of the National Academy of Sciences of the United States of America, (Mar. 31, 1998) Vol. 95, No. 7, pp. 3513-3518).

[0089] nblB has been identified in the complete genome of Synechocystis (Kaneko et al., supra) and is thought to play a role in the degradation of the light harvesting, electron transport complex phycobilisome (Dolganov et al., Journal of Bacteriology, (January, 1999) Vol. 181, No. 2, pp. 610-617).

[0090] rpoD has been identified in the complete genome of Synechocystis (Kaneko et al., supra) and is a sigma factor of chloroplast RNA polymerase used in rhodophytes (Liu et al., Journal of Phycology, (August, 1999) Vol. 35, No. 4, pp. 778-785) and other cyanobacteria (Asayama et al., Journal of Biochemistry (Tokyo), (March, 1999) Vol. 125, No. 3, pp.; Caslake et al., Microbiology (Reading), (December, 1997) Vol. 143, No. 12, pp. 3807-3818; Tanaka et al., Biosci Biotechnol Biochem, (1992) 56 (7), 1113-1117).

[0091] The ftsH gene has been identified in the complete genome of Synechocystis (Kaneko et al., supra), in red algae (Itoh et al., Plant Molecular Biology, (October, 1999) Vol. 41, No. 3, pp. 321-337), in E. coli (Jayasekera et al., Archives of Biochemistry and Biophysics, (Aug. 1, 2000) Vol. 380, No. 1) and in higher plants such as tobacco (Seo et al., Plant Cell, (June, 2000) Vol. 12, No. 6, pp. 917-932). The gene product of ftsH is a metalloprotease bound to the thylakoid membrane; it degrades unassembled proteins and is involved in the degradation of the D1 protein (Adam, Z., Biochimie (Paris), (June-July, 2000) Vol. 82, No. 6-7, pp. 647-654).

[0092] The clpB gene has been identified in the complete genome of Synechocystis (Kaneko et al., supra) and in other cyanobacteria and is thought to play a role in acquired thermotolerance (Keeler et al., Plant Physiology (Rockville), (July, 2000) Vol. 123, No. 3, pp. 1121-1132).

[0093] In addition to the above mentioned UV-B inducible genes, genes of unknown function have been identified as being highly induced by UV-B light. The most significant genes in this category have nucleic acid and amino acid sequences as set forth in SEQ ID NOs:9 and 10, 11 and 12, 17 and 18, 21 and 22, 25 and 26, 31 and 32, and 39 and 40, respectively.

[0094] These genes, although known in a variety of cyanobacteria and higher plants, are responsive to a diverse array of induction triggers. However, until Applicant's invention it was not appreciated that all such genes may be highly induced when the host cell is exposed to UV-B light. It will be appreciated that although these observations were made with genes isolated from the cyanobacteria Synechocystis sp. PCC6803, it will be expected that homologues of these genes in similar organisms, including higher plants will behave in a similar fashion. Homologues of these genes are those genes having similar function in related organisms and may have significant nucleotide or amino acid sequence homology over some or all of the sequence. Homologues having significant sequence homology may be identified by means well known in the art. Examples of sequence-dependent protocols for homologue identification include, but are not limited to, methods of nucleic acid hybridization, and methods of DNA and RNA amplification as exemplified by various uses of nucleic acid amplification technologies [e.g. polymerase chain reaction, Mullis et al., U.S. Pat. No. 4,683,202; ligase chain reaction (LCR), Tabor, S. et al., Proc. Acad. Sci. USA 82, 1074, (1985)] or strand displacement amplification [SDA, Walker, et al., Proc. Natl. Acad. Sci. U.S.A., 89, 392, (1992)].

[0095] Generally two short segments of the instant sequences may be used in polymerase chain reaction protocols to amplify longer nucleic acid fragments encoding homologous genes from DNA or RNA. The polymerase chain reaction may also be performed on a library of cloned nucleic acid fragments wherein the sequence of one primer is derived from the instant nucleic acid fragments, and the sequence of the other primer takes advantage of the presence of the polyadenylic acid tracts to the 3′ end of the mRNA precursor encoding microbial genes.

[0096] Alternatively the instant sequences may be employed as hybridization reagents for the identification of homologues. The basic components of a nucleic acid hybridization test include a probe, a sample suspected of containing the gene or gene fragment of interest, and a specific hybridization method. Probes of the present invention are typically single stranded nucleic acid sequences which are complementary to the nucleic acid sequences to be detected. Probes are “hybridizable” to the nucleic acid sequence to be detected. The probe length can vary from 5 bases to tens of thousands of bases, and will depend upon the specific test to be done. Typically a probe length of about 15 bases to about 30 bases is suitable. Only part of the probe molecule need be complementary to the nucleic acid sequence to be detected. In addition, the complementarity between the probe and the target sequence need not be perfect. Hybridization does occur between imperfectly complementary molecules with the result that a certain fraction of the bases in the hybridized region are not paired with the proper complementary base. Hybridization methods are well defined and have been described above.

[0097] Coding Region of Interest

[0098] In a specific embodiment of Applicants' invention, the coding region of interest may be either endogenous or heterologous to the cyanobacterium host cell. Any coding region that may be fused to the promoter regions of the invention and which will be expressed in a cyanobacterial host is suitable. Coding regions derived from genes that have commercial significance are preferred. A particularly preferred, but non-limiting, list includes genes encoding enzymes involved in the production of isoprenoid molecules, genes encoding polyhydroxyalkanoic acid (PHA) synthases (phaE; GenBank® Accession No. GI 1652508, phaC; GenBank® Accession No. GI 1652509) from Synechocystis or other bacteria, genes encoding carotenoid pathway genes such as phytoene synthase (crtB; GenBank® Accession No. GI 1652930), phytoene desaturase (crtD; GenBank® Accession No. GI 1652929), beta-carotene ketolase (crtO; GenBank® Accession No. GI 1001724), and the like, ethylene forming enzyme (efe) for ethylene production, pyruvate decarboxylase (pdc), alcohol dehydrogenase (adh), cyclic terpenoid synthases (i.e. limonene synthase, pinene synthase, bornyl synthase, phellandrene synthase, cineole synthase, and sabinene synthase) for the production of terpenoids, and taxadiene synthase for the production of taxol, and the like. Genes encoding enzymes involved in the production of isoprenoid molecules include, for example, geranylgeranyl pyrophosphate synthase (crtE; GenBank® Accession No. GI 1651762) and solanesyl diphosphate synthase (sds; GenBank® Accession No. GI 1651651), which can be expressed in Synechocystis to exploit the high flux for the isoprenoid pathway in this organism. Genes encoding polyhydroxyalkanoic acid (PHA) synthases (phaE, phaC) may be used for the production of biodegradable plastics.

[0099] Microbial Expression

[0100] Once a coding region of interest has been identified, a fusion with the appropriate inducible promoter region may be constructed by means well known in the art. Gene expression protocols are similar in Synechocystis and other bacteria (Maniatis, et al. supra; Donald A Bryant, The Molecular Biology of Cyanobacteria, Kluwer Academic Publisher, 1994), except the growth requirements are different (see Rippka et al., 1979, supra). Typically Synechocystis is grown in BG11 media (Sigma) containing 5 mM glucose, at 30° C., illuminated with 15-50 μE s⁻¹ m⁻² white light. The Synechocystis cell culture is grown to mid logarithmic phase before an inducer (such as UV-B, or isopropyl thio-β-galactopyranoside) is added to induce protein expression.

[0101] Vectors or cassettes useful for the transformation of suitable host cells are well known in the art. Typically the vector or cassette contains sequences directing transcription and translation of the relevant gene, a selectable marker, and sequences allowing autonomous replication or chromosomal integration. Suitable vectors comprise a region 5′ of the gene which harbors transcriptional initiation controls and a region 3′ of the DNA fragment which controls transcriptional termination. It is most preferred when both control regions are derived from genes homologous to the transformed host cell, although it is to be understood that such control regions need not be derived from the genes native to the specific species chosen as a production host. There are two kinds of preferred vectors for use in Synechocystis: self-replicating plasmids and chromosome integration plasmids. The self-replicating plasmids have the advantage of having multiple copies of coding regions of interest, and therefore the expression level can be very high. Chromosome integration plasmids are integrated into the genome by recombination. They have the advantage of being stable, but they may suffer from a lower level of expression. A specific embodiment of the present invention provides that the genetic construct resides on a plasmid in the transformed cyanobacterium. Alternatively, the genetic construct may be chromosomally integrated in the cyanobacterium genome.

[0102] Termination control regions may also be derived from various genes native to the preferred hosts. Optionally, a termination site may be unnecessary, however, it is most preferred if included.

[0103] Suitable host cells for use with the methods and promoters of the invention will include genera in the cyanobacterial family. Preferred hosts will include, but are not limited to, the genera Asterocapsa, Aphanizomenon, Microcystis, Cylindrospermum, Anacystis, Psychrophilic Anabaena, Nostoc, Tychonema, Planktothrix, Lyngbya, Schizothrix, Nodularia, Synechocystis and Synechococcus, where the genera Synechocystis and Synechococcus are most preferred.

[0104] Synechocystis sp. PCC6803 is a naturally competent host for transformation. DNA is directly added to actively growing cells, and plated on a selective media with the appropriate antibiotic marker. Expression of desired gene products involves growing the transformed host cells in illumination of 15-50 μE s⁻¹ m⁻² intensity of white light at 30° C., inducing expression of the transformed gene with an inducing agent, e.g., UV-B light or a chemical inducer, until cells reach a high density, e.g., optical density (OD)_(730nm)=4. Cells are harvested and gene products are isolated according to protocols specific for the gene product. Other host cells may also be used within the scope of the invention, including but not limited to other species of Synechocystis, Synechococcus species, other cyanobacteria, and the like.

[0105] Culture Conditions

[0106] Once a gene fusion comprising an inducible promoter region operably linked to a coding region of interest is inserted into an appropriate host cell, the expression of the coding region may be controlled by regulating the inducer. In the case of a fusion comprising the amiC or rbcX gene the cells need only be grown in the log phase for induction and expression to occur. Where the fusion comprises any of the UV-B light inducible promoter regions, the cultures must be exposed to a suitable UV-B wavelength and at a suitable intensity. Wavelengths of about 290 nm to about 330 nm are preferred and a light intensity of about 20 μE s⁻¹ m⁻² to about 80 μE s⁻¹ m⁻² is suitable.
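Because radiometers commonly report irradiance in energy units rather than as a photon flux, it may be convenient to convert between the two. The following Python sketch is offered only as an illustration and rests on an assumption not made in the specification: that the light is treated as monochromatic at a single wavelength (302 nm is used here because it is the nominal mid-range of the lamp named in Example 1). A real lamp emits over a band of wavelengths, so the result is an approximation. The function names are hypothetical.

```python
# Illustrative sketch only. Assumes monochromatic light; function names are
# hypothetical and are not taken from the specification.
AVOGADRO = 6.02214076e23    # photons per mole (1 einstein = 1 mole of photons)
PLANCK = 6.62607015e-34     # Planck constant, J s
LIGHT_SPEED = 2.99792458e8  # speed of light, m / s


def photon_flux_to_irradiance(flux_ue_s_m2: float, wavelength_nm: float) -> float:
    """Convert a photon flux in microeinsteins s^-1 m^-2 to irradiance in W m^-2."""
    # Energy carried by one mole of photons at this wavelength, in joules.
    energy_per_mole = AVOGADRO * PLANCK * LIGHT_SPEED / (wavelength_nm * 1e-9)
    return flux_ue_s_m2 * 1e-6 * energy_per_mole


def w_m2_to_mw_cm2(irradiance_w_m2: float) -> float:
    """Convert W m^-2 to mW cm^-2 (1 W m^-2 equals 0.1 mW cm^-2)."""
    return irradiance_w_m2 * 0.1


# The lower and upper ends of the intensity range given above, taken at 302 nm.
low = photon_flux_to_irradiance(20, 302)
high = photon_flux_to_irradiance(80, 302)
print(f"20 uE s^-1 m^-2 at 302 nm is about {low:.2f} W m^-2")
print(f"80 uE s^-1 m^-2 at 302 nm is about {high:.2f} W m^-2")
```

Under this assumption the stated range corresponds to roughly 8 to 32 W m⁻², or about 0.8 to 3.2 mW cm⁻².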

[0107] Where commercial production of a protein encoded by a gene fusion is desired, a variety of culture methodologies may be applied. For example, large scale production of a specific gene product overexpressed from a recombinant microbial host may be achieved by either batch or continuous culture methodologies.

[0108] A classical batch culturing method is a closed system where the composition of the media is set at the beginning of the culture and not subject to artificial alterations during the culturing process. Thus, at the beginning of the culturing process the media is inoculated with the desired organism or organisms and growth or metabolic activity is permitted to occur adding nothing to the system. Typically, however, a “batch” culture is batch with respect to the addition of carbon source and attempts are often made at controlling factors such as pH and oxygen concentration. In batch systems the metabolite and biomass compositions of the system change constantly up to the time the culture is terminated. Within batch cultures cells moderate through a static lag phase to a high growth log phase and finally to a stationary phase where growth rate is diminished or halted. If untreated, cells in the stationary phase will eventually die. Cells in log phase are often responsible for the bulk of production of end product or intermediate in some systems. Stationary or post-exponential phase production can be obtained in other systems.

[0109] A variation on the standard batch system is the Fed-Batch system. Fed-Batch culture processes are also suitable in the present invention and comprise a typical batch system with the exception that the substrate is added in increments as the culture progresses. Fed-Batch systems are useful when catabolite repression is apt to inhibit the metabolism of the cells and where it is desirable to have limited amounts of substrate in the media. Measurement of the actual substrate concentration in Fed-Batch systems is difficult and is therefore estimated on the basis of the changes of measurable factors such as pH, dissolved oxygen and the partial pressure of waste gases such as CO₂. Batch and Fed-Batch culturing methods are common and well known in the art and examples may be found in Thomas D. Brock in Biotechnology: A Textbook of Industrial Microbiology, Second Edition (1989) Sinauer Associates, Inc., Sunderland, Mass., or Deshpande, Mukund V., Appl. Biochem. Biotechnol., 36, 227, (1992), herein incorporated by reference.

[0110] Alternatively, commercial production of proteins encoded by the instant gene fusions may also be accomplished with a continuous culture. Continuous cultures are an open system where a defined culture media is added continuously to a bioreactor and an equal amount of conditioned media is removed simultaneously for processing. Continuous cultures generally maintain the cells at a constant high liquid phase density where cells are primarily in log phase growth. Alternatively continuous culture may be practiced with immobilized cells where carbon and nutrients are continuously added, and valuable products, by-products or waste products are continuously removed from the cell mass. Cell immobilization may be performed using a wide range of solid supports composed of natural and/or synthetic materials.

[0111] Continuous or semi-continuous culture allows for the modulation of one factor or any number of factors that affect cell growth or end product concentration. For example, one method will maintain a limiting nutrient such as the carbon source or nitrogen level at a fixed rate and allow all other parameters to moderate. In other systems a number of factors affecting growth can be altered continuously while the cell concentration, measured by media turbidity, is kept constant. Continuous systems strive to maintain steady state growth conditions and thus the cell loss due to media being drawn off must be balanced against the cell growth rate in the culture. Methods of modulating nutrients and growth factors for continuous culture processes as well as techniques for maximizing the rate of product formation are well known in the art of industrial microbiology and a variety of methods are detailed by Brock, supra.
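The balance between cell loss and cell growth described above can be written, in the simplest idealized case, as dX/dt = (μ − D)·X, where X is the cell concentration, μ the specific growth rate and D the dilution rate (medium flow divided by culture volume). The following Python sketch illustrates that relation only. It assumes constant μ and D, which a real culture will not strictly obey, and the function name and numerical values are hypothetical.

```python
# Illustrative sketch only. Assumes a constant specific growth rate and a
# constant dilution rate; names and values are hypothetical.
import math


def biomass_after(x0: float, growth_rate: float, dilution_rate: float, hours: float) -> float:
    """Cell concentration after `hours`, from dX/dt = (growth_rate - dilution_rate) * X.

    Rates are per hour; x0 is the starting cell concentration in any unit.
    """
    return x0 * math.exp((growth_rate - dilution_rate) * hours)


# Steady state: medium is drawn off exactly as fast as the cells grow.
print(biomass_after(1.0, 0.05, 0.05, 24))

# Washout: medium is drawn off faster than the cells grow, so density falls.
print(biomass_after(1.0, 0.05, 0.10, 24))
```

In practice the growth rate itself depends on the limiting nutrient, which is what allows a continuous culture to settle at a steady state; the sketch leaves that dependence out.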

EXAMPLES

[0112] The present invention is further defined in the following Examples. It should be understood that these Examples, while indicating preferred embodiments of the invention, are given by way of illustration only. From the above discussion and these Examples, one skilled in the art can ascertain the essential characteristics of this invention, and without departing from the spirit and scope thereof, can make various changes and modifications of the invention to adapt it to various uses and conditions.

[0113] General Methods

[0114] Standard recombinant DNA and molecular cloning techniques used herein are well known in the art and are described by Sambrook, J., Fritsch, E. F. and Maniatis, T. Molecular Cloning: A Laboratory Manual; Cold Spring Harbor Laboratory Press: Cold Spring Harbor, (1989) (Maniatis) and by T. J. Silhavy, M. L. Berman, and L. W. Enquist, Experiments with Gene Fusions, Cold Spring Harbor Laboratory, Cold Spring Harbor, N.Y. (1984) and by Ausubel, F. M. et al., Current Protocols in Molecular Biology, pub. by Greene Publishing Assoc. and Wiley-Interscience (1987).

[0115] Materials and methods suitable for the maintenance and growth of bacterial cultures are well known in the art. Techniques suitable for use in the following examples may be found as set out in Manual of Methods for General Bacteriology (Phillipp Gerhardt, R. G. E. Murray, Ralph N. Costilow, Eugene W. Nester, Willis A. Wood, Noel R. Krieg and G. Briggs Phillips, eds), American Society for Microbiology, Washington, D.C. (1994)) or by Thomas D. Brock in Biotechnology: A Textbook of Industrial Microbiology, Second Edition, Sinauer Associates, Inc., Sunderland, Mass. (1989). All reagents, restriction enzymes and materials used for the growth and maintenance of bacterial cells were obtained from Aldrich Chemicals (Milwaukee, Wis.), DIFCO Laboratories (Detroit, Mich.), GIBCO/BRL (Gaithersburg, Md.), or Sigma Chemical Company (St. Louis, Mo.) unless otherwise specified.

[0116] Synechocystis sp. PCC6803 used in the following examples is available from the American Type Culture Collection, accession number ATCC 27184 (Rippka et al., 1979. J. Gen. Micro., 111:1-61).

[0117] Synechocystis sp. PCC6803 DNA Microarray Preparation

[0118] Synechocystis DNA microarray slides were prepared using a Molecular Dynamics GenIII Spotter (Molecular Dynamics, Sunnyvale, Calif.). A collection of purified PCR products of all Synechocystis open reading frames was transferred from 384 well microtiter plates to microarray glass slides using the GenIII spotter. The spotted slides were stored in a desiccated container at room temperature, where they were stable for about three months.

[0119] Hybridization of Microarray Slides and Quantitation of Gene Expression

[0120] Microarray glass slides (Molecular Dynamics, Sunnyvale, Calif.) were treated with 100% isopropanol for 10 min, boiling double distilled water for 5 min, then treated with blocking buffer (3.5×SSC, 0.2% SDS, 1% BSA) for 20 min at 60° C., rinsed five times with double distilled water, then twice with isopropanol, followed by drying under nitrogen. Typically 100 picomoles of Cy3 labeled cDNA probes were prepared from total RNA isolated from the UV-B treated Synechocystis culture and mixed with an equal amount of Cy5 labeled cDNA probes prepared from total RNA isolated from the untreated Synechocystis culture. These were applied to a glass slide in a total volume of 30 μL. The hybridization was repeated using 100 picomoles of Cy5 labeled cDNA probes prepared from total RNA isolated from UV-B treated Synechocystis culture mixed with an equal amount of Cy3 labeled cDNA probes prepared from total RNA isolated from the untreated culture. These were applied to a second glass slide in a total volume of 30 μL. The hybridization reactions on the glass slides were incubated for 16 hr at 42° C. in a humidified chamber. Hybridized slides were washed in 1×SSC (0.15 M NaCl, 0.015 M sodium citrate), 0.1% SDS for 5 min at 42° C.; 0.1×SSC, 0.1% SDS for 5 min at 42° C.; three washes in 0.1×SSC for 2 min at room temperature; rinsed with double distilled water and then with 100% isopropanol; and dried under nitrogen. The slides were scanned using a Molecular Dynamics laser scanner for imaging of Cy3 and Cy5 labeled cDNA probes. The images were analyzed using Array Vision Software (Molecular Dynamics, Sunnyvale, Calif.) to obtain fluorescence signal intensities of each spot (each ORF on the array) to quantitate gene expression. The normalized ratio between the signals in the two channels (red:green) was calculated, and the relative intensity of Cy5/Cy3 probes for each spot represents the relative abundance of specific mRNAs in each sample.
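The ratio calculation for the reciprocally labeled (dye-swap) pair of slides can be sketched as follows. This Python fragment is an illustration only and is not the Array Vision procedure. The specification does not state how the ratios were normalized, so centering the log ratios on the slide median is an assumption made here, as is averaging the two slides. All function names are hypothetical and the intensity values are made up for demonstration.

```python
# Illustrative sketch only. Median centering and dye-swap averaging are
# assumptions; the intensities below are invented and are not measured data.
import math
from statistics import median


def normalized_log_ratios(treated, control):
    """log2(treated / control) per spot, shifted so the slide median is zero."""
    raw = [math.log2(t / c) for t, c in zip(treated, control)]
    center = median(raw)
    return [r - center for r in raw]


def dye_swap_average(slide_a, slide_b):
    """Average per-spot log ratios from two reciprocally labeled slides."""
    return [(a + b) / 2 for a, b in zip(slide_a, slide_b)]


# Slide 1: Cy3 = UV-B treated, Cy5 = untreated control.
slide1_treated_cy3 = [800, 100, 200, 400, 100]
slide1_control_cy5 = [100, 100, 200, 400, 100]

# Slide 2 (dye swap): Cy5 = UV-B treated, Cy3 = untreated control.
# The treated channel is twice as bright overall, mimicking a dye bias.
slide2_treated_cy5 = [1600, 200, 400, 800, 200]
slide2_control_cy3 = [100, 100, 200, 400, 100]

slide1 = normalized_log_ratios(slide1_treated_cy3, slide1_control_cy5)
slide2 = normalized_log_ratios(slide2_treated_cy5, slide2_control_cy3)
combined = dye_swap_average(slide1, slide2)

for spot, log_ratio in enumerate(combined):
    print(f"spot {spot}: fold change (treated/control) = {2 ** log_ratio:.1f}")
```

Expressing both slides as treated over control before averaging means a dye-specific bias that inflates one channel tends to cancel between the two labelings, which is the usual motivation for repeating the hybridization with the dyes exchanged.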

[0121] Minimal media was used in many of the cultures of the following examples and means a growth media composed of various salts required for the growth of the microbial/bacterial strain. In general, minimal media lacks amino acids, peptides, and sugars, and is commercially available from GIBCO (Grand Rapids Mich.).

[0122] The meaning of abbreviations is as follows: “h” and “hr” mean hour(s), “min” means minute(s), “sec” or “s” mean second(s), “d” means day(s), “mL” means milliliters, “L” means liters, “μg” means micrograms, “mg” means milligrams, “pmol” means picomoles, “μM” means micromolar, “mM” means millimolar, “M” means molar, “nm” means nanometer(s), “m” means meter(s), “OD” means optical density, “rpm” means revolutions per minute, and “μE” means microeinstein(s), wherein 1 μE equals 10⁻⁶ moles of photons.

Example 1 Preparation of Synechocystis sp. PCC6803 cDNA Probes

[0123] Example 1 describes the construction of Synechocystis sp. PCC6803 cDNA probes following growth of the cells in either minimal growth media (control) or minimal media plus UV-B light treatment. The cDNA probes were used to determine gene expression patterns of many genes simultaneously on a Synechocystis sp. PCC6803 DNA microarray as described in Examples 2 and 3 below.

[0124] Synechocystis Strain and Culture Methods

[0125] Briefly, Synechocystis sp. PCC6803 cells were grown at 30 μE s⁻¹ m⁻² light intensity in a minimal growth media, BG-11 (Rippka, R., Deruelles, J., Waterbury, J. B., Herdman, M., Stanier, R. Y. (1979) J. Gen. Microbiol. 111, 1-61), at 30° C., with shaking at 100 rpm. Fifty milliliters of Synechocystis cells grown to mid logarithmic phase (OD_(730nm)=0.8 to 1.0) were divided into two 25 mL cultures and transferred from the Erlenmeyer growth flask to two 100 mL plastic Petri dishes. The Petri dishes were placed on a rotary shaker and shaken at 100 rpm.

[0126] Cell Treatments

[0127] For the control, the Petri dishes comprising the Synechocystis cells were placed on a rotary shaker with the lids on, and shaken at 100 rpm for 20 min or 2 hr. For the UV-B treated group, the Petri dishes comprising the Synechocystis cells were placed on a rotary shaker with the lids on, and shaken at 100 rpm for 20 min or 2 hr. A UV-B lamp (UVM-28, mid range at 302 nm, Ultra-Violet Products, Upland, Calif.) was positioned above the Petri dishes and the distance between the UV-B light source and the Petri dishes was adjusted to give the desired level of UV-B light intensity. The level of UV-B light intensity was measured at the surface of the cell culture using a UVX-31 radiometer (Ultra-Violet Products, Upland, Calif.), following the manufacturer's instructions. UV-B treatment was performed with the lid on for either 20 min or 120 min. Following UV-B irradiation, the cells were immediately cooled on ice and their RNA isolated as described below.

[0128] Total RNA Isolation and cDNA Probe Synthesis

[0129] Control Synechocystis cells and UV-B treated Synechocystis cells were cooled rapidly on ice and centrifuged at 3200×g for 5 min. Total RNA samples were isolated using the Qiagen RNeasy® Mini Kit (Qiagen, Valencia, Calif.), following the manufacturer's protocol. RNase A digestion was performed according to the manufacturer's instructions, and a second round of purification was performed using the RNeasy® Mini Kit. The purified total RNA was analyzed by agarose gel electrophoresis.

[0130] From each total RNA preparation, both Cy3 and Cy5 fluorescent dye labeled cDNA probes were prepared. To synthesize the Cy3 or Cy5 labeled cDNA probes, a reverse transcription reaction was performed using 10 μg total RNA, 12 μg random hexamer (Ambion, Austin, Tex.), 50 μM of dATP, dGTP, and dTTP, 25 μM of dCTP, 15 μM Cy3-dCTP or 22 μM Cy5-dCTP (Amersham Pharmacia Biotech, Piscataway N.J.), 10 mM DTT, 50 mM Tris-HCl pH 8.3, 75 mM KCl, 15 mM MgCl₂ and 4 units of AMV reverse transcriptase (Gibco BRL-Life Technologies, Rockville, Md.) in a total volume of 40 μL. The reaction was carried out at 42° C. for 2.5 hr. After the labeling reaction, RNA templates were degraded by alkaline hydrolysis and the cDNA probes were purified using a Qiagen PCR purification kit. The purified probes were quantitated by measuring the absorbance at 260 nm, 550 nm (Cy3 dye incorporation) and 650 nm (Cy5 dye incorporation). Prior to hybridization, 100-200 pmol of the purified Cy3 or Cy5 labeled cDNA probes were dried under vacuum, and re-dissolved in the hybridization buffer (5×SSC, 50% formamide, 0.1% SDS, and 0.03 mg/mL salmon sperm DNA).
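As an illustration of how the picomoles of incorporated dye might be estimated from the absorbance readings, the following Python sketch applies the Beer-Lambert law. It is not taken from the source: it assumes a 1 cm path length and the commonly cited molar extinction coefficients of about 150,000 M⁻¹ cm⁻¹ for Cy3 (absorbance maximum near 550 nm) and about 250,000 M⁻¹ cm⁻¹ for Cy5 (absorbance maximum near 650 nm). The function name `dye_pmol` and the example readings are hypothetical.

```python
# Assumed molar extinction coefficients (M^-1 cm^-1); commonly cited values,
# not specified in the source text.
EXTINCTION = {"Cy3": 150_000.0, "Cy5": 250_000.0}


def dye_pmol(absorbance, volume_ul, dye, path_cm=1.0):
    """Estimate pmol of incorporated dye from absorbance at the dye maximum.

    Beer-Lambert: concentration (M) = A / (epsilon * path length).
    pmol = concentration (mol/L) * volume (L) * 1e12
         = A * volume_ul * 1e6 / (epsilon * path_cm)
    """
    epsilon = EXTINCTION[dye]
    return absorbance * volume_ul * 1e6 / (epsilon * path_cm)


# Hypothetical readings for a 100 uL probe eluate.
print(dye_pmol(0.15, 100.0, "Cy3"))  # absorbance read near 550 nm
print(dye_pmol(0.25, 100.0, "Cy5"))  # absorbance read near 650 nm
```

Under these assumptions, a probe preparation would be diluted or concentrated so that 100-200 pmol of incorporated dye is carried into each hybridization.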

Example 2 Analysis of Synechocystis sp. PCC6803 Gene Expression in Minimal Media

[0131] Example 2 describes the identification of the most highly expressed genes and their corresponding strong promoters in Synechocystis sp. PCC6803 when grown in log phase in BG11 media containing 5 mM glucose as described above.

[0132] Specifically, a DNA microarray was prepared according to the methods described above using PCR amplified open reading frames and using genomic Synechocystis sp. PCC6803 DNA as template. Synechocystis sp. PCC6803 gene expression was determined by hybridizing this DNA microarray as described above with fluorescent cDNA probes synthesized from total RNA isolated from Synechocystis sp. PCC6803 cells grown in BG11 media containing 5 mM glucose as described in Example 1.

[0133] Briefly, for each duplicated minimal media experiment, two hybridization reactions were performed as described above. Specifically, the first reaction used equimolar amounts (typically 100-200 pmol of incorporated fluorescent dye) of Cy5-labeled cDNA from total RNA of the minimal media grown sample and Cy3-labeled cDNA probes from the same sample. The second reaction used both Cy5 and Cy3-labeled cDNA synthesized from Synechocystis sp. PCC6803 genomic DNA. The signal intensities were quantitated as described above. To calculate the relative expression level of each Synechocystis gene in cells grown in the minimal media, the average normalized signal intensity of the hybridized cDNA probes was divided by the average signal intensity of the hybridized cDNA probes from genomic DNA. Analysis of the data from these microarray experiments identified the most highly expressed genes, i.e., those genes that are under the control of the strongest promoters, in Synechocystis grown in log phase under these minimal media conditions (see Table 1).

TABLE 1
Most highly expressed genes in Synechocystis sp. PCC6803 in minimal growth media (BG11 + 5 mM glucose).

Systematic  Gene    Function                                          Transcript copy in total   SEQ ID NO:
Name                                                                  mRNA (average copy = 1)    NA**   AA***
slr2051     cpcG    Phycobilisome rod-core linker polypeptide CpcG    64.91
sll1580     cpcC    Phycocyanin associated linker protein             22.71
slr0447     amiC    Putative periplasmic binding protein              19.45                      1      2
sll1070     tktA    Transketolase                                     19.24
sll0018     cbbA    Fructose-1,6-bisphosphate aldolase                14.27
slr0011     rbcX    Putative chaperone                                12.00                      3      4
ssl0563     psaC    photosystem I subunit VII                         11.31
slr1655     psaL    photosystem I subunit XI                          10.91
sll0819     psaF    photosystem I subunit III                         10.56
sll1867     psbA3   photosystem II D1 protein                         10.43
sll1324     atpF    ATP synthase subunit b                            10.37
sll1746     rpl12   50S ribosomal protein L12                         10.13
sll1099     tufA    protein synthesis elongation factor Tu            9.48
slr0009     rbcL    ribulose bisphosphate carboxylase large subunit   8.39
slr0012     rbcS    ribulose bisphosphate carboxylase small subunit   8.14
sll1326     atpA    ATP synthase a subunit                            7.72
slr1908             ND*                                               7.62
sll1578     cpcA    phycocyanin a subunit                             7.60
slr2067     apcA    allophycocyanin a chain                           7.51
slr2052             ND*                                               7.41
sll1184     ho      heme oxygenase                                    7.27
ssl3437     rps17   30S ribosomal protein S17                         7.26
sll1786             hypothetical protein (ND*)                        7.16                       5      6
ssl0020     petF    ferredoxin                                        7.07
sll1812     rps5    30S ribosomal protein S5                          7.04
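The relative expression calculation used for Table 1 can be sketched in Python as follows. The division of the cDNA signal by the genomic DNA signal follows the text; the final rescaling so that the mean over all genes equals 1 is an assumption made here to match the "average copy = 1" column heading, since the source does not spell out that step. The function name `relative_expression` and the gene labels and intensities are hypothetical.

```python
def relative_expression(cdna_signal, genomic_signal):
    """Return per-gene relative transcript levels, scaled to a mean of 1.

    cdna_signal: gene -> average normalized signal from cDNA probes.
    genomic_signal: gene -> average signal from genomic DNA probes, used to
    correct for spot-to-spot differences in the amount of arrayed DNA.
    """
    ratios = {
        gene: cdna_signal[gene] / genomic_signal[gene] for gene in cdna_signal
    }
    mean_ratio = sum(ratios.values()) / len(ratios)
    # Assumed rescaling: the average gene is assigned a copy value of 1.
    return {gene: ratio / mean_ratio for gene, ratio in ratios.items()}


# Hypothetical averaged intensities for three ORFs.
cdna = {"orfA": 300.0, "orfB": 100.0, "orfC": 50.0}
genomic = {"orfA": 100.0, "orfB": 100.0, "orfC": 100.0}
for gene, level in relative_expression(cdna, genomic).items():
    print(gene, round(level, 2))
```

Dividing by the genomic DNA signal means that a spot carrying more arrayed DNA does not appear more highly expressed simply because it binds more probe.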

Example 3 Analysis of Synechocystis sp. PCC6803 Gene Expression Following UV-B Exposure

[0134] Example 3 describes the identification of the most highly UV-B responsive genes in Synechocystis sp. PCC6803 when grown under minimal media conditions and exposed to 20 minutes of UV-B irradiation at 20 μE s⁻¹ m⁻² intensity. These UV inducible promoters can be used to control expression of certain proteins that may be toxic to Synechocystis cells. Microarrays and probes were prepared for UV-B induced and non-induced experiments essentially as described above using Synechocystis sp. PCC6803.

[0135] Specifically, a DNA microarray was prepared according to the methods described above using DNA isolated from Synechocystis sp. PCC6803. For each UV-B treatment experiment, two hybridization reactions were performed. In particular, the first reaction used equimolar amounts (typically 100-200 pmol) of Cy5-labeled cDNA made from total RNA isolated from the UV-B treated sample and Cy3-labeled cDNA from total RNA isolated from the control sample (Synechocystis sp. PCC6803 grown in BG11 media containing 5 mM glucose). The second reaction used Cy3-labeled cDNA made from total RNA isolated from the UV-B treated sample, and Cy5-labeled cDNA made from total RNA isolated from the control sample. The signal intensities were quantitated as described above. To calculate the ratio of fold induction (i.e., UV-B/control), the UV-B treated sample signal intensities were divided by the signal intensities of the control sample. Since there were two sets of data from duplicate spotting within each slide, the total number of gene expression measurements for each gene was four. All four induction ratios for each gene were analyzed to determine the standard deviation, an indicator of the level of confidence for the specific data set for each gene.
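The dye-swap calculation described above can be sketched in Python. For each gene, two duplicate spots on the first slide (UV-B sample labeled with Cy5) and two on the second slide (UV-B sample labeled with Cy3) give four UV-B/control ratios, which are summarized by their mean and standard deviation. The use of the sample standard deviation (n - 1 denominator) is an assumption, as the source does not say which form was used; the function name `induction_summary` and the intensities are hypothetical.

```python
from statistics import mean, stdev


def induction_summary(slide1_spots, slide2_spots):
    """Return (mean, standard deviation) of UV-B/control ratios for one gene.

    Each spot is a (cy5, cy3) tuple of normalized intensities.
    Slide 1: Cy5 = UV-B treated, Cy3 = control, so ratio = cy5 / cy3.
    Slide 2 (dye swap): Cy3 = UV-B treated, Cy5 = control, so ratio = cy3 / cy5.
    """
    ratios = [cy5 / cy3 for cy5, cy3 in slide1_spots]
    ratios += [cy3 / cy5 for cy5, cy3 in slide2_spots]
    # Sample standard deviation is an assumption, not stated in the source.
    return mean(ratios), stdev(ratios)


# Hypothetical duplicate-spot intensities for a single ORF.
slide1 = [(800.0, 100.0), (600.0, 100.0)]  # UV-B in Cy5
slide2 = [(100.0, 700.0), (100.0, 500.0)]  # UV-B in Cy3
fold, sd = induction_summary(slide1, slide2)
print(round(fold, 2), round(sd, 2))
```

Swapping the dyes between the two slides means that any dye-specific labeling or detection bias pushes the two pairs of ratios in opposite directions, so it tends to show up as a larger standard deviation instead of a spurious induction.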

[0136] Analysis of the data defined the most highly UV-B induced genes in Synechocystis following UV-B treatment (see Table 2). Only genes whose expression was induced more than 4 fold by UV-B light (20 min at 20 μE s⁻¹ m⁻² intensity) as compared to the minimal media control are listed in Table 2.

[0137] In addition to genes of known function in the group of UV inducible genes, there are several genes of unknown function: slr1544, sll0528, sll0846, slr1674, slr0320, and ssr2016. The results tabulated in Table 2 are the first level of functional assignment for these genes. The promoters of these genes can be used to construct UV inducible expression vectors in Synechocystis.

TABLE 2
Most highly induced genes in Synechocystis sp. PCC6803 in BG11 media containing 5 mM glucose, with 20 min of UV-B treatment at 20 μE s⁻¹ m⁻² intensity

Systematic  Gene    Function                                    Data/                      SEQ ID NO:
Name                                                            Control       STD   NA**   AA***
ssr2595     hliB    High light-inducible protein                22.7          4.7   7      8
slr1544             ND*                                         15.5          7.6   9      10
sll0528             ND*                                         12.1          3.9   11     12
sll1514     hsp17   small heat shock protein                    9.9           3.9   13     14
slr1687     nblB    phycobilisome degradation protein NblB      8.2           1.9   15     16
sll1483             transforming growth factor induced protein  7.8           2.2   17     18
sll2012     rpoD    RNA polymerase sigma factor                 6.3           2.0   19     20
ssl1633             CAB/ELIP/HLIP superfamily                   6.0           1.0   21     22
ssl2542     hliA    high light-inducible protein                5.6           1.6   23     24
sll0846             ND*                                         4.7           0.9   25     26
slr1674             ND*                                         4.7           1.8   27     28
slr1604     ftsH    Chloroplast associated protease FtsH        4.6           1.9   29     30
slr0320             ND*                                         4.5           2.2   31     32
sll0306     rpoD    RNA polymerase sigma factor                 4.4           1.0   33     34
slr0228     ftsH    cell division protein FtsH                  4.3           1.7   35     36
slr1641     clpB    ClpB protein                                4.3           1.1   37     38
ssr2016             ND*                                         4.2           2.2   39     40
sll1867     psbA3   photosystem II D1 protein                   4.1           0.3
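The selection rule used for Table 2 (induction of more than 4 fold) can be expressed as a short Python sketch. The three ORF values are taken from Table 2; the entry `orf_uninduced` and the function name `select_induced` are hypothetical and are included only to show a gene being excluded.

```python
def select_induced(fold_induction, threshold=4.0):
    """Return (gene, ratio) pairs with ratio above threshold, highest first."""
    selected = [
        (gene, ratio) for gene, ratio in fold_induction.items() if ratio > threshold
    ]
    return sorted(selected, key=lambda item: item[1], reverse=True)


# UV-B/control ratios; the last entry is a hypothetical uninduced ORF.
ratios = {
    "sll1867": 4.1,
    "ssr2595": 22.7,
    "sll1514": 9.9,
    "orf_uninduced": 1.2,
}
for gene, ratio in select_induced(ratios):
    print(gene, ratio)
```

In practice the standard deviation of the four measurements would be reported alongside each selected ratio, as in the STD column of Table 2.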

[0138]

1 40 1 1341 DNA Synechocystis sp. strain PCC6803 CDS (1)..(1341) 1 atg act aac cct ttt gga aga cgt aaa ttt ttg ctg tat gga tca gca 48 Met Thr Asn Pro Phe Gly Arg Arg Lys Phe Leu Leu Tyr Gly Ser Ala 1 5 10 15 acc cta ggc gcc agt cta ttg ctc aaa gcc tgt ggc ggc ggc acg gaa 96 Thr Leu Gly Ala Ser Leu Leu Leu Lys Ala Cys Gly Gly Gly Thr Glu 20 25 30 cct acc acc gaa ccc act gct gaa ccg act gag tcc ccc acc acc ggt 144 Pro Thr Thr Glu Pro Thr Ala Glu Pro Thr Glu Ser Pro Thr Thr Gly 35 40 45 act gct ccc acc ggg gaa ccg att aaa gtt ggt ttg ctc cac tcc ctc 192 Thr Ala Pro Thr Gly Glu Pro Ile Lys Val Gly Leu Leu His Ser Leu 50 55 60 agt ggc acc atg gcc atc agt gaa acc acc gtg gtg gaa gcg gcg gaa 240 Ser Gly Thr Met Ala Ile Ser Glu Thr Thr Val Val Glu Ala Ala Glu 65 70 75 80 ctg gcg atc gaa gag atc aat gcg gcc ggt gga gtt ttg ggt aga ccc 288 Leu Ala Ile Glu Glu Ile Asn Ala Ala Gly Gly Val Leu Gly Arg Pro 85 90 95 att gaa gcc atc aaa gaa gat gga gct tcc gat tgg ccg act ttt gcg 336 Ile Glu Ala Ile Lys Glu Asp Gly Ala Ser Asp Trp Pro Thr Phe Ala 100 105 110 gaa aaa gcg gct aag tta att gac caa gat aag gta ccc gta gtc ttt 384 Glu Lys Ala Ala Lys Leu Ile Asp Gln Asp Lys Val Pro Val Val Phe 115 120 125 ggt tgt tgg act tcc gcc agc cgg aaa gcg gta ctg ccg gta ttt gaa 432 Gly Cys Trp Thr Ser Ala Ser Arg Lys Ala Val Leu Pro Val Phe Glu 130 135 140 gcc aaa aat cat atg ctt tgg tac cca gta cag tac gaa ggt cag gaa 480 Ala Lys Asn His Met Leu Trp Tyr Pro Val Gln Tyr Glu Gly Gln Glu 145 150 155 160 tgt tcc aaa aac att ttc tac acc ggt gcc gcc ccc aac caa caa att 528 Cys Ser Lys Asn Ile Phe Tyr Thr Gly Ala Ala Pro Asn Gln Gln Ile 165 170 175 gaa ccg gcg gtg gat tgg ttg ctg gaa aat aaa ggc aat aag ttc ttc 576 Glu Pro Ala Val Asp Trp Leu Leu Glu Asn Lys Gly Asn Lys Phe Phe 180 185 190 ctg gtg ggt tcc gat tac gtt ttc ccc cgc act gct aac acc atc att 624 Leu Val Gly Ser Asp Tyr Val Phe Pro Arg Thr Ala Asn Thr Ile Ile 195 200 205 aaa gag cag ttg aaa gcc aaa ggt ggc gaa acc ctt ggg gaa gat tac 672 Lys 
Glu Gln Leu Lys Ala Lys Gly Gly Glu Thr Leu Gly Glu Asp Tyr 210 215 220 cta ccc ctg ggt aac acc gaa gtt acc cct atc atc acc aaa atc cgg 720 Leu Pro Leu Gly Asn Thr Glu Val Thr Pro Ile Ile Thr Lys Ile Arg 225 230 235 240 gaa gcc ctg ccc gat ggc ggc gta att ttc aac acc ctg aat ggg gac 768 Glu Ala Leu Pro Asp Gly Gly Val Ile Phe Asn Thr Leu Asn Gly Asp 245 250 255 agt aat gtt gcc ttc ttc aag cag atc caa gcc gct ggt ttg acc ccc 816 Ser Asn Val Ala Phe Phe Lys Gln Ile Gln Ala Ala Gly Leu Thr Pro 260 265 270 gat aaa tat ccg gtc atg tcc gtg agt gtg gcg gaa gag gaa gta cgt 864 Asp Lys Tyr Pro Val Met Ser Val Ser Val Ala Glu Glu Glu Val Arg 275 280 285 caa att ggt aag gag tat ctg ctc ggc cag ttt gct tct tgg aac tat 912 Gln Ile Gly Lys Glu Tyr Leu Leu Gly Gln Phe Ala Ser Trp Asn Tyr 290 295 300 ttc cag agt gtg gat acc cct gcc aac caa aaa ttt gtg gca gct ttt 960 Phe Gln Ser Val Asp Thr Pro Ala Asn Gln Lys Phe Val Ala Ala Phe 305 310 315 320 aag gct aaa tac ggt gaa gac cgg gtg acc aac gac ccc atg gaa gca 1008 Lys Ala Lys Tyr Gly Glu Asp Arg Val Thr Asn Asp Pro Met Glu Ala 325 330 335 gct tat att tcc gtt tac ctc tgg aag gcg gcg gtg gaa gcg gct gga 1056 Ala Tyr Ile Ser Val Tyr Leu Trp Lys Ala Ala Val Glu Ala Ala Gly 340 345 350 gat gtg ggt gaa act ccc gaa ggc tta gaa aaa gtc cgg gcg gcg gcg 1104 Asp Val Gly Glu Thr Pro Glu Gly Leu Glu Lys Val Arg Ala Ala Ala 355 360 365 att ggt aaa acc ttt gac gct ccg gaa ggc atg gtg acc atg caa ccc 1152 Ile Gly Lys Thr Phe Asp Ala Pro Glu Gly Met Val Thr Met Gln Pro 370 375 380 aac cac cac att tcc aaa act gtc cgc att ggg gaa gtc aat gac gaa 1200 Asn His His Ile Ser Lys Thr Val Arg Ile Gly Glu Val Asn Asp Glu 385 390 395 400 ggt cag ttc acc att gtt tgg tcc agt gat ggc ccc gtg gac ccc att 1248 Gly Gln Phe Thr Ile Val Trp Ser Ser Asp Gly Pro Val Asp Pro Ile 405 410 415 ccc tgg aac cag ttc gta ccg gaa acc aaa ggt ttc acc tgc gat tgg 1296 Pro Trp Asn Gln Phe Val Pro Glu Thr Lys Gly Phe Thr Cys Asp Trp 420 425 430 acc cgc aca gat gtg gaa aat cct ggt 
aag ttc aag gcc agc taa 1341 Thr Arg Thr Asp Val Glu Asn Pro Gly Lys Phe Lys Ala Ser 435 440 445 2 446 PRT Synechocystis sp. strain PCC6803 2 Met Thr Asn Pro Phe Gly Arg Arg Lys Phe Leu Leu Tyr Gly Ser Ala 1 5 10 15 Thr Leu Gly Ala Ser Leu Leu Leu Lys Ala Cys Gly Gly Gly Thr Glu 20 25 30 Pro Thr Thr Glu Pro Thr Ala Glu Pro Thr Glu Ser Pro Thr Thr Gly 35 40 45 Thr Ala Pro Thr Gly Glu Pro Ile Lys Val Gly Leu Leu His Ser Leu 50 55 60 Ser Gly Thr Met Ala Ile Ser Glu Thr Thr Val Val Glu Ala Ala Glu 65 70 75 80 Leu Ala Ile Glu Glu Ile Asn Ala Ala Gly Gly Val Leu Gly Arg Pro 85 90 95 Ile Glu Ala Ile Lys Glu Asp Gly Ala Ser Asp Trp Pro Thr Phe Ala 100 105 110 Glu Lys Ala Ala Lys Leu Ile Asp Gln Asp Lys Val Pro Val Val Phe 115 120 125 Gly Cys Trp Thr Ser Ala Ser Arg Lys Ala Val Leu Pro Val Phe Glu 130 135 140 Ala Lys Asn His Met Leu Trp Tyr Pro Val Gln Tyr Glu Gly Gln Glu 145 150 155 160 Cys Ser Lys Asn Ile Phe Tyr Thr Gly Ala Ala Pro Asn Gln Gln Ile 165 170 175 Glu Pro Ala Val Asp Trp Leu Leu Glu Asn Lys Gly Asn Lys Phe Phe 180 185 190 Leu Val Gly Ser Asp Tyr Val Phe Pro Arg Thr Ala Asn Thr Ile Ile 195 200 205 Lys Glu Gln Leu Lys Ala Lys Gly Gly Glu Thr Leu Gly Glu Asp Tyr 210 215 220 Leu Pro Leu Gly Asn Thr Glu Val Thr Pro Ile Ile Thr Lys Ile Arg 225 230 235 240 Glu Ala Leu Pro Asp Gly Gly Val Ile Phe Asn Thr Leu Asn Gly Asp 245 250 255 Ser Asn Val Ala Phe Phe Lys Gln Ile Gln Ala Ala Gly Leu Thr Pro 260 265 270 Asp Lys Tyr Pro Val Met Ser Val Ser Val Ala Glu Glu Glu Val Arg 275 280 285 Gln Ile Gly Lys Glu Tyr Leu Leu Gly Gln Phe Ala Ser Trp Asn Tyr 290 295 300 Phe Gln Ser Val Asp Thr Pro Ala Asn Gln Lys Phe Val Ala Ala Phe 305 310 315 320 Lys Ala Lys Tyr Gly Glu Asp Arg Val Thr Asn Asp Pro Met Glu Ala 325 330 335 Ala Tyr Ile Ser Val Tyr Leu Trp Lys Ala Ala Val Glu Ala Ala Gly 340 345 350 Asp Val Gly Glu Thr Pro Glu Gly Leu Glu Lys Val Arg Ala Ala Ala 355 360 365 Ile Gly Lys Thr Phe Asp Ala Pro Glu Gly Met Val Thr Met Gln Pro 370 375 380 Asn His His Ile Ser Lys Thr Val Arg 
Ile Gly Glu Val Asn Asp Glu 385 390 395 400 Gly Gln Phe Thr Ile Val Trp Ser Ser Asp Gly Pro Val Asp Pro Ile 405 410 415 Pro Trp Asn Gln Phe Val Pro Glu Thr Lys Gly Phe Thr Cys Asp Trp 420 425 430 Thr Arg Thr Asp Val Glu Asn Pro Gly Lys Phe Lys Ala Ser 435 440 445 3 417 DNA Synechocystis sp. strain PCC6803 CDS (1)..(417) 3 gtg ttc atg caa act aag cac ata gct cag gca aca gtg aaa gta ctg 48 Val Phe Met Gln Thr Lys His Ile Ala Gln Ala Thr Val Lys Val Leu 1 5 10 15 caa agt tac ctc acc tac caa gcc gtt ctc agg atc cag agt gaa ctc 96 Gln Ser Tyr Leu Thr Tyr Gln Ala Val Leu Arg Ile Gln Ser Glu Leu 20 25 30 ggg gaa acc aac cct ccc cag gcc att tgg tta aac cag tat tta gcc 144 Gly Glu Thr Asn Pro Pro Gln Ala Ile Trp Leu Asn Gln Tyr Leu Ala 35 40 45 agt cac agt att caa aat gga gaa acg ttt ttg acg gaa ctc ctg gat 192 Ser His Ser Ile Gln Asn Gly Glu Thr Phe Leu Thr Glu Leu Leu Asp 50 55 60 gaa aat aaa gaa ctg gta ctc agg atc ctg gcg gta agg gaa gac att 240 Glu Asn Lys Glu Leu Val Leu Arg Ile Leu Ala Val Arg Glu Asp Ile 65 70 75 80 gcc gaa tca gtg tta gat ttt ttg ccc ggt atg acc cgg aat agc tta 288 Ala Glu Ser Val Leu Asp Phe Leu Pro Gly Met Thr Arg Asn Ser Leu 85 90 95 gcg gaa tct aac atc gcc cac cgc cgc cat ttg ctt gaa cgt ctg acc 336 Ala Glu Ser Asn Ile Ala His Arg Arg His Leu Leu Glu Arg Leu Thr 100 105 110 cgt acc gta gcc gaa gtc gat aat ttc cct tcg gaa acc tcc aac gga 384 Arg Thr Val Ala Glu Val Asp Asn Phe Pro Ser Glu Thr Ser Asn Gly 115 120 125 gaa tca aac aac aac gat tct ccc ccg tcc taa 417 Glu Ser Asn Asn Asn Asp Ser Pro Pro Ser 130 135 4 138 PRT Synechocystis sp. 
strain PCC6803 4 Val Phe Met Gln Thr Lys His Ile Ala Gln Ala Thr Val Lys Val Leu 1 5 10 15 Gln Ser Tyr Leu Thr Tyr Gln Ala Val Leu Arg Ile Gln Ser Glu Leu 20 25 30 Gly Glu Thr Asn Pro Pro Gln Ala Ile Trp Leu Asn Gln Tyr Leu Ala 35 40 45 Ser His Ser Ile Gln Asn Gly Glu Thr Phe Leu Thr Glu Leu Leu Asp 50 55 60 Glu Asn Lys Glu Leu Val Leu Arg Ile Leu Ala Val Arg Glu Asp Ile 65 70 75 80 Ala Glu Ser Val Leu Asp Phe Leu Pro Gly Met Thr Arg Asn Ser Leu 85 90 95 Ala Glu Ser Asn Ile Ala His Arg Arg His Leu Leu Glu Arg Leu Thr 100 105 110 Arg Thr Val Ala Glu Val Asp Asn Phe Pro Ser Glu Thr Ser Asn Gly 115 120 125 Glu Ser Asn Asn Asn Asp Ser Pro Pro Ser 130 135 5 786 DNA Synechocystis sp. strain PCC6803 CDS (1)..(786) 5 atg cat tta gtt gat acc cat gtc cac att aac ttt gat gtt ttt gcg 48 Met His Leu Val Asp Thr His Val His Ile Asn Phe Asp Val Phe Ala 1 5 10 15 gcg gat tta gac cag tta cag cat cgc tgg cgg caa gct ggg gtg gtg 96 Ala Asp Leu Asp Gln Leu Gln His Arg Trp Arg Gln Ala Gly Val Val 20 25 30 caa ctg gtt cat tcc tgc gtt aag ccc cag gag ttt gat caa ata cag 144 Gln Leu Val His Ser Cys Val Lys Pro Gln Glu Phe Asp Gln Ile Gln 35 40 45 tct ctg gcg gac cgt ttt cct gaa cta ttt ttc gcc gtg gga ctc cat 192 Ser Leu Ala Asp Arg Phe Pro Glu Leu Phe Phe Ala Val Gly Leu His 50 55 60 cct ttg gat gcc gaa gat tgg caa gac aat act gct ggg caa atc ctt 240 Pro Leu Asp Ala Glu Asp Trp Gln Asp Asn Thr Ala Gly Gln Ile Leu 65 70 75 80 gcc tat gcc aag gcg gat gac cgg gtg gta gcc att ggt gaa atg ggt 288 Ala Tyr Ala Lys Ala Asp Asp Arg Val Val Ala Ile Gly Glu Met Gly 85 90 95 ttg gat ttt ttc aaa gcc gat aac cgt gac cat caa att gag gtt ttc 336 Leu Asp Phe Phe Lys Ala Asp Asn Arg Asp His Gln Ile Glu Val Phe 100 105 110 cgg gcc cag ttg gcg atc gcc agg gaa tta aac aag cca gtg att atc 384 Arg Ala Gln Leu Ala Ile Ala Arg Glu Leu Asn Lys Pro Val Ile Ile 115 120 125 cat tgt cgg gat gcc gcc cag acc atg cgc cag gta ttg act gat ttc 432 His Cys Arg Asp Ala Ala Gln Thr Met Arg Gln Val Leu Thr Asp Phe 130 135 140 
caa gca gaa tcg ggg ccc gtg gct ggg gta atg cac tgt tgg ggt ggc 480 Gln Ala Glu Ser Gly Pro Val Ala Gly Val Met His Cys Trp Gly Gly 145 150 155 160 act cct gaa gaa acc caa tgg ttt ttg gac ctg ggg ttt tac atc agt 528 Thr Pro Glu Glu Thr Gln Trp Phe Leu Asp Leu Gly Phe Tyr Ile Ser 165 170 175 ttt agc ggc aca gtt acc ttc aaa aaa gct gaa ggg atc caa gcc agt 576 Phe Ser Gly Thr Val Thr Phe Lys Lys Ala Glu Gly Ile Gln Ala Ser 180 185 190 gcc cag atg gtc ccc ccc gat cgc ctg ttg gtg gaa acc gat tgt ccc 624 Ala Gln Met Val Pro Pro Asp Arg Leu Leu Val Glu Thr Asp Cys Pro 195 200 205 ttt ttg gcg cca gtg ccc caa cgg ggt aaa cgc aat gaa cca gcc ttt 672 Phe Leu Ala Pro Val Pro Gln Arg Gly Lys Arg Asn Glu Pro Ala Phe 210 215 220 gtc cgc cat gtg gcc gag gcg atc gct gcc ctg cgc cat gtc ccc cta 720 Val Arg His Val Ala Glu Ala Ile Ala Ala Leu Arg His Val Pro Leu 225 230 235 240 gaa acc ctt gcc caa caa acc act act aat gcc cgc aac ctt ttt aaa 768 Glu Thr Leu Ala Gln Gln Thr Thr Thr Asn Ala Arg Asn Leu Phe Lys 245 250 255 cta ccg gtg cct gcc taa 786 Leu Pro Val Pro Ala 260 6 261 PRT Synechocystis sp. 
strain PCC6803 6 Met His Leu Val Asp Thr His Val His Ile Asn Phe Asp Val Phe Ala 1 5 10 15 Ala Asp Leu Asp Gln Leu Gln His Arg Trp Arg Gln Ala Gly Val Val 20 25 30 Gln Leu Val His Ser Cys Val Lys Pro Gln Glu Phe Asp Gln Ile Gln 35 40 45 Ser Leu Ala Asp Arg Phe Pro Glu Leu Phe Phe Ala Val Gly Leu His 50 55 60 Pro Leu Asp Ala Glu Asp Trp Gln Asp Asn Thr Ala Gly Gln Ile Leu 65 70 75 80 Ala Tyr Ala Lys Ala Asp Asp Arg Val Val Ala Ile Gly Glu Met Gly 85 90 95 Leu Asp Phe Phe Lys Ala Asp Asn Arg Asp His Gln Ile Glu Val Phe 100 105 110 Arg Ala Gln Leu Ala Ile Ala Arg Glu Leu Asn Lys Pro Val Ile Ile 115 120 125 His Cys Arg Asp Ala Ala Gln Thr Met Arg Gln Val Leu Thr Asp Phe 130 135 140 Gln Ala Glu Ser Gly Pro Val Ala Gly Val Met His Cys Trp Gly Gly 145 150 155 160 Thr Pro Glu Glu Thr Gln Trp Phe Leu Asp Leu Gly Phe Tyr Ile Ser 165 170 175 Phe Ser Gly Thr Val Thr Phe Lys Lys Ala Glu Gly Ile Gln Ala Ser 180 185 190 Ala Gln Met Val Pro Pro Asp Arg Leu Leu Val Glu Thr Asp Cys Pro 195 200 205 Phe Leu Ala Pro Val Pro Gln Arg Gly Lys Arg Asn Glu Pro Ala Phe 210 215 220 Val Arg His Val Ala Glu Ala Ile Ala Ala Leu Arg His Val Pro Leu 225 230 235 240 Glu Thr Leu Ala Gln Gln Thr Thr Thr Asn Ala Arg Asn Leu Phe Lys 245 250 255 Leu Pro Val Pro Ala 260 7 213 DNA Synechocystis sp. strain PCC6803 CDS (1)..(213) 7 atg act agc cgc gga ttt cgc ctc gac caa gac aac cgt ctc aac aac 48 Met Thr Ser Arg Gly Phe Arg Leu Asp Gln Asp Asn Arg Leu Asn Asn 1 5 10 15 ttc gcc att gaa ccc cct gtg tac gtt gac agc agt gtt caa gcc ggt 96 Phe Ala Ile Glu Pro Pro Val Tyr Val Asp Ser Ser Val Gln Ala Gly 20 25 30 tgg act gaa tac gcc gaa aaa atg aat ggt cgt ttt gcc atg att ggc 144 Trp Thr Glu Tyr Ala Glu Lys Met Asn Gly Arg Phe Ala Met Ile Gly 35 40 45 ttt gtt tct ctc ttg gca atg gaa gta att act ggc cac ggc att gtg 192 Phe Val Ser Leu Leu Ala Met Glu Val Ile Thr Gly His Gly Ile Val 50 55 60 ggt tgg ttg ctc tct ctc taa 213 Gly Trp Leu Leu Ser Leu 65 70 8 70 PRT Synechocystis sp. 
strain PCC6803 8 Met Thr Ser Arg Gly Phe Arg Leu Asp Gln Asp Asn Arg Leu Asn Asn 1 5 10 15 Phe Ala Ile Glu Pro Pro Val Tyr Val Asp Ser Ser Val Gln Ala Gly 20 25 30 Trp Thr Glu Tyr Ala Glu Lys Met Asn Gly Arg Phe Ala Met Ile Gly 35 40 45 Phe Val Ser Leu Leu Ala Met Glu Val Ile Thr Gly His Gly Ile Val 50 55 60 Gly Trp Leu Leu Ser Leu 65 70 9 312 DNA Synechocystis sp. strain PCC6803 CDS (1)..(312) 9 atg aac tac caa agg act gcc ctt ggc acc gtg aaa atc gaa caa ata 48 Met Asn Tyr Gln Arg Thr Ala Leu Gly Thr Val Lys Ile Glu Gln Ile 1 5 10 15 aga ggt aaa act atg aac gcc gac act gat att tat caa aac aaa gat 96 Arg Gly Lys Thr Met Asn Ala Asp Thr Asp Ile Tyr Gln Asn Lys Asp 20 25 30 cta ttt gcc ccc gtt gtc ttc cgc aaa gac ttc aac caa ttt gcc ccc 144 Leu Phe Ala Pro Val Val Phe Arg Lys Asp Phe Asn Gln Phe Ala Pro 35 40 45 atc aac ggg aac caa gcc tgg tct tta ttt ttc acc gcc ggg caa gaa 192 Ile Asn Gly Asn Gln Ala Trp Ser Leu Phe Phe Thr Ala Gly Gln Glu 50 55 60 gat aag caa ctg ggc aac agc cct gaa ttc ggt cgc ttt ttc acc aat 240 Asp Lys Gln Leu Gly Asn Ser Pro Glu Phe Gly Arg Phe Phe Thr Asn 65 70 75 80 act ctc ttc gcc att ggg gct gcc act ttc atc tgg ggt tac ttc ttc 288 Thr Leu Phe Ala Ile Gly Ala Ala Thr Phe Ile Trp Gly Tyr Phe Phe 85 90 95 agc cgt tgg gct gac ttt ctc taa 312 Ser Arg Trp Ala Asp Phe Leu 100 10 103 PRT Synechocystis sp. strain PCC6803 10 Met Asn Tyr Gln Arg Thr Ala Leu Gly Thr Val Lys Ile Glu Gln Ile 1 5 10 15 Arg Gly Lys Thr Met Asn Ala Asp Thr Asp Ile Tyr Gln Asn Lys Asp 20 25 30 Leu Phe Ala Pro Val Val Phe Arg Lys Asp Phe Asn Gln Phe Ala Pro 35 40 45 Ile Asn Gly Asn Gln Ala Trp Ser Leu Phe Phe Thr Ala Gly Gln Glu 50 55 60 Asp Lys Gln Leu Gly Asn Ser Pro Glu Phe Gly Arg Phe Phe Thr Asn 65 70 75 80 Thr Leu Phe Ala Ile Gly Ala Ala Thr Phe Ile Trp Gly Tyr Phe Phe 85 90 95 Ser Arg Trp Ala Asp Phe Leu 100 11 1140 DNA Synechocystis sp. 
strain PCC6803 CDS (1)..(1140) 11 atg tta agc ctc agt tta ggg ggg cag ttt atg aac aac aat atc cgc 48 Met Leu Ser Leu Ser Leu Gly Gly Gln Phe Met Asn Asn Asn Ile Arg 1 5 10 15 gtc ggc agt ctg ttt ggc att cct ttt tac gtc aac cca tcc tgg ttt 96 Val Gly Ser Leu Phe Gly Ile Pro Phe Tyr Val Asn Pro Ser Trp Phe 20 25 30 tta att tta gga ttg gtg acc ctg agc tat ggc caa gac tta gcc cgc 144 Leu Ile Leu Gly Leu Val Thr Leu Ser Tyr Gly Gln Asp Leu Ala Arg 35 40 45 ttt ccc caa ctt tcc ggt ggc aca ccc tgg att ttg ggg tta att aca 192 Phe Pro Gln Leu Ser Gly Gly Thr Pro Trp Ile Leu Gly Leu Ile Thr 50 55 60 gct tta ctc ctc ttt gct tcc gtt gtc gcc cac gag ttg ggc cat agt 240 Ala Leu Leu Leu Phe Ala Ser Val Val Ala His Glu Leu Gly His Ser 65 70 75 80 ttg gtt gcc tta gcc cag ggc att gaa gtt aaa tcc atc act ctg ttt 288 Leu Val Ala Leu Ala Gln Gly Ile Glu Val Lys Ser Ile Thr Leu Phe 85 90 95 ttg ttc ggt ggt cta gcg agt tta gaa aag gaa tcc aac act ccc tgg 336 Leu Phe Gly Gly Leu Ala Ser Leu Glu Lys Glu Ser Asn Thr Pro Trp 100 105 110 caa gct ttt gcg gtg gcg atc gcc ggg ccg gcg gtg agt tta gtg ctc 384 Gln Ala Phe Ala Val Ala Ile Ala Gly Pro Ala Val Ser Leu Val Leu 115 120 125 ttt ttg ggt tta acc ata gtt ggt acc caa atc ccc cta cct gtg ccg 432 Phe Leu Gly Leu Thr Ile Val Gly Thr Gln Ile Pro Leu Pro Val Pro 130 135 140 ggg cag gcc atc att ggt tta ttg ggc atg atc aac ctc gcc ctg gca 480 Gly Gln Ala Ile Ile Gly Leu Leu Gly Met Ile Asn Leu Ala Leu Ala 145 150 155 160 ttg ttt aac ctc att cct ggt tta cct ttg gac ggc ggc aat gtg ctc 528 Leu Phe Asn Leu Ile Pro Gly Leu Pro Leu Asp Gly Gly Asn Val Leu 165 170 175 aaa tcc att gtg tgg caa atc acg ggc aat caa aac aaa ggt att ctc 576 Lys Ser Ile Val Trp Gln Ile Thr Gly Asn Gln Asn Lys Gly Ile Leu 180 185 190 att gct agt cgg gtg ggc cag ggt ttc ggt tgg ttg gcg atc gcc att 624 Ile Ala Ser Arg Val Gly Gln Gly Phe Gly Trp Leu Ala Ile Ala Ile 195 200 205 ggt agc tta ggt att tta aat att ctg ccc atc ggt agc ttc tgg acc 672 Gly Ser Leu Gly Ile Leu Asn Ile Leu 
Pro Ile Gly Ser Phe Trp Thr 210 215 220 att ttg atc ggt tgg ttc ctg tta caa aat gct ggt tcc tcc gcc cgc 720 Ile Leu Ile Gly Trp Phe Leu Leu Gln Asn Ala Gly Ser Ser Ala Arg 225 230 235 240 aac gcc cag gtc aaa gag caa atg gaa gcc ttt act gct gaa gat gcg 768 Asn Ala Gln Val Lys Glu Gln Met Glu Ala Phe Thr Ala Glu Asp Ala 245 250 255 gtt att ccc aac agc ccc att att cct gcc ggg tta aat att cgg gaa 816 Val Ile Pro Asn Ser Pro Ile Ile Pro Ala Gly Leu Asn Ile Arg Glu 260 265 270 ttt gct aac gat tat gtg att ggt aaa acc ccc tgg cga cgg ttc ttg 864 Phe Ala Asn Asp Tyr Val Ile Gly Lys Thr Pro Trp Arg Arg Phe Leu 275 280 285 gtt att ggt gcc gac aat caa ctg tta ggt gta ctt gct acg gaa gac 912 Val Ile Gly Ala Asp Asn Gln Leu Leu Gly Val Leu Ala Thr Glu Asp 290 295 300 atc aaa cac gtc ccc act tcc gat tgg ccc cag gtc aca gtg gat agc 960 Ile Lys His Val Pro Thr Ser Asp Trp Pro Gln Val Thr Val Asp Ser 305 310 315 320 ttg atg cag tat ccc caa cag atg gtc acc gtt aac gcc aat caa tct 1008 Leu Met Gln Tyr Pro Gln Gln Met Val Thr Val Asn Ala Asn Gln Ser 325 330 335 ttg ttt gaa gtg gcc cag ttg tta gat caa cag aaa ctg tcg gaa ctt 1056 Leu Phe Glu Val Ala Gln Leu Leu Asp Gln Gln Lys Leu Ser Glu Leu 340 345 350 ttg gtg gtg caa cct tcg gga gaa gtg gtg gga tta ttg gaa aaa gct 1104 Leu Val Val Gln Pro Ser Gly Glu Val Val Gly Leu Leu Glu Lys Ala 355 360 365 tcc atc atc aaa tgt ctg caa acc tcc gcc gcc tag 1140 Ser Ile Ile Lys Cys Leu Gln Thr Ser Ala Ala 370 375 12 379 PRT Synechocystis sp. 
strain PCC6803 12 Met Leu Ser Leu Ser Leu Gly Gly Gln Phe Met Asn Asn Asn Ile Arg 1 5 10 15 Val Gly Ser Leu Phe Gly Ile Pro Phe Tyr Val Asn Pro Ser Trp Phe 20 25 30 Leu Ile Leu Gly Leu Val Thr Leu Ser Tyr Gly Gln Asp Leu Ala Arg 35 40 45 Phe Pro Gln Leu Ser Gly Gly Thr Pro Trp Ile Leu Gly Leu Ile Thr 50 55 60 Ala Leu Leu Leu Phe Ala Ser Val Val Ala His Glu Leu Gly His Ser 65 70 75 80 Leu Val Ala Leu Ala Gln Gly Ile Glu Val Lys Ser Ile Thr Leu Phe 85 90 95 Leu Phe Gly Gly Leu Ala Ser Leu Glu Lys Glu Ser Asn Thr Pro Trp 100 105 110 Gln Ala Phe Ala Val Ala Ile Ala Gly Pro Ala Val Ser Leu Val Leu 115 120 125 Phe Leu Gly Leu Thr Ile Val Gly Thr Gln Ile Pro Leu Pro Val Pro 130 135 140 Gly Gln Ala Ile Ile Gly Leu Leu Gly Met Ile Asn Leu Ala Leu Ala 145 150 155 160 Leu Phe Asn Leu Ile Pro Gly Leu Pro Leu Asp Gly Gly Asn Val Leu 165 170 175 Lys Ser Ile Val Trp Gln Ile Thr Gly Asn Gln Asn Lys Gly Ile Leu 180 185 190 Ile Ala Ser Arg Val Gly Gln Gly Phe Gly Trp Leu Ala Ile Ala Ile 195 200 205 Gly Ser Leu Gly Ile Leu Asn Ile Leu Pro Ile Gly Ser Phe Trp Thr 210 215 220 Ile Leu Ile Gly Trp Phe Leu Leu Gln Asn Ala Gly Ser Ser Ala Arg 225 230 235 240 Asn Ala Gln Val Lys Glu Gln Met Glu Ala Phe Thr Ala Glu Asp Ala 245 250 255 Val Ile Pro Asn Ser Pro Ile Ile Pro Ala Gly Leu Asn Ile Arg Glu 260 265 270 Phe Ala Asn Asp Tyr Val Ile Gly Lys Thr Pro Trp Arg Arg Phe Leu 275 280 285 Val Ile Gly Ala Asp Asn Gln Leu Leu Gly Val Leu Ala Thr Glu Asp 290 295 300 Ile Lys His Val Pro Thr Ser Asp Trp Pro Gln Val Thr Val Asp Ser 305 310 315 320 Leu Met Gln Tyr Pro Gln Gln Met Val Thr Val Asn Ala Asn Gln Ser 325 330 335 Leu Phe Glu Val Ala Gln Leu Leu Asp Gln Gln Lys Leu Ser Glu Leu 340 345 350 Leu Val Val Gln Pro Ser Gly Glu Val Val Gly Leu Leu Glu Lys Ala 355 360 365 Ser Ile Ile Lys Cys Leu Gln Thr Ser Ala Ala 370 375 13 441 DNA Synechocystis sp. 
strain PCC6803 CDS (1)..(441) 13 atg tct ctc att ctt tac aat ccc ctg cgg gaa atg gat aat ttc cag 48 Met Ser Leu Ile Leu Tyr Asn Pro Leu Arg Glu Met Asp Asn Phe Gln 1 5 10 15 cag cag atg aac caa ctg ttt gaa gaa gtt ttt gtc cct acg gac cgc 96 Gln Gln Met Asn Gln Leu Phe Glu Glu Val Phe Val Pro Thr Asp Arg 20 25 30 cac ggc gat cgc caa ggg ttt aat cct aaa gca gaa cta act gaa act 144 His Gly Asp Arg Gln Gly Phe Asn Pro Lys Ala Glu Leu Thr Glu Thr 35 40 45 gaa gaa gcc tat gtg ctc aaa cta gaa tta cct ggc atg gac ccc gat 192 Glu Glu Ala Tyr Val Leu Lys Leu Glu Leu Pro Gly Met Asp Pro Asp 50 55 60 aat ttg gac atc caa gcc gcc agg gat gcg gtg acc gtc agc ggc gat 240 Asn Leu Asp Ile Gln Ala Ala Arg Asp Ala Val Thr Val Ser Gly Asp 65 70 75 80 cgc cag gat acc cat agc acc gaa aaa gat ggg gtg cgg cgc aca gag 288 Arg Gln Asp Thr His Ser Thr Glu Lys Asp Gly Val Arg Arg Thr Glu 85 90 95 ttc cgc tat ggc agt ttc cgc cgg gtt att cct gta cct gga gca atc 336 Phe Arg Tyr Gly Ser Phe Arg Arg Val Ile Pro Val Pro Gly Ala Ile 100 105 110 caa aac aca gaa gtt aaa gct aat tac gat gcc ggt atc cta act ttg 384 Gln Asn Thr Glu Val Lys Ala Asn Tyr Asp Ala Gly Ile Leu Thr Leu 115 120 125 act ttg ccc aaa gta gag gaa gcc aaa aat aaa gtg gtg aaa gtt cag 432 Thr Leu Pro Lys Val Glu Glu Ala Lys Asn Lys Val Val Lys Val Gln 130 135 140 ctt tcc taa 441 Leu Ser 145 14 146 PRT Synechocystis sp. 
strain PCC6803 14 Met Ser Leu Ile Leu Tyr Asn Pro Leu Arg Glu Met Asp Asn Phe Gln 1 5 10 15 Gln Gln Met Asn Gln Leu Phe Glu Glu Val Phe Val Pro Thr Asp Arg 20 25 30 His Gly Asp Arg Gln Gly Phe Asn Pro Lys Ala Glu Leu Thr Glu Thr 35 40 45 Glu Glu Ala Tyr Val Leu Lys Leu Glu Leu Pro Gly Met Asp Pro Asp 50 55 60 Asn Leu Asp Ile Gln Ala Ala Arg Asp Ala Val Thr Val Ser Gly Asp 65 70 75 80 Arg Gln Asp Thr His Ser Thr Glu Lys Asp Gly Val Arg Arg Thr Glu 85 90 95 Phe Arg Tyr Gly Ser Phe Arg Arg Val Ile Pro Val Pro Gly Ala Ile 100 105 110 Gln Asn Thr Glu Val Lys Ala Asn Tyr Asp Ala Gly Ile Leu Thr Leu 115 120 125 Thr Leu Pro Lys Val Glu Glu Ala Lys Asn Lys Val Val Lys Val Gln 130 135 140 Leu Ser 145 15 702 DNA Synechocystis sp. strain PCC6803 CDS (1)..(702) 15 atg gca gaa gaa att ctc aga aac cca gcc atg aca gcc ctg acc ctc 48 Met Ala Glu Glu Ile Leu Arg Asn Pro Ala Met Thr Ala Leu Thr Leu 1 5 10 15 gaa caa att gcc agc caa ctc gac agc ccc aat tcc cgc gat cgc ctg 96 Glu Gln Ile Ala Ser Gln Leu Asp Ser Pro Asn Ser Arg Asp Arg Leu 20 25 30 att gcc cta gct tcc ctg aga ccc tat tcc agt gag gag gcg gtg ccc 144 Ile Ala Leu Ala Ser Leu Arg Pro Tyr Ser Ser Glu Glu Ala Val Pro 35 40 45 ctg att aaa aaa gtt tta gat gac gat act tta cag gtg cgt tcc atg 192 Leu Ile Lys Lys Val Leu Asp Asp Asp Thr Leu Gln Val Arg Ser Met 50 55 60 gcg gtg ttt gcc ctg ggc att aag caa acc gag gaa tgc tat ccc att 240 Ala Val Phe Ala Leu Gly Ile Lys Gln Thr Glu Glu Cys Tyr Pro Ile 65 70 75 80 ctg gtt aag ctg ttg gaa acc gat gga gac tat ggc atc cgg gcc gat 288 Leu Val Lys Leu Leu Glu Thr Asp Gly Asp Tyr Gly Ile Arg Ala Asp 85 90 95 gcc gcg ggg gcc ctg ggt tat cta gaa gac gaa cgg gct ttc cat ccc 336 Ala Ala Gly Ala Leu Gly Tyr Leu Glu Asp Glu Arg Ala Phe His Pro 100 105 110 ctc tgc cgg gct ttt tac gaa gat acg gaa tgg ctg gtg cgg ttc agt 384 Leu Cys Arg Ala Phe Tyr Glu Asp Thr Glu Trp Leu Val Arg Phe Ser 115 120 125 gcg gcg gtg gcc ctg ggc aat tta aaa gat att cgg gct caa acg gtc 432 Ala Ala Val Ala Leu Gly Asn Leu 
Lys Asp Ile Arg Ala Gln Thr Val 130 135 140 ttg ctg gaa gca ctg aaa agt gac gaa gca gtg gta caa caa gcg gcg 480 Leu Leu Glu Ala Leu Lys Ser Asp Glu Ala Val Val Gln Gln Ala Ala 145 150 155 160 atc gcg gcc ctg ggg gaa att ggt gcc gtg gat gca gta gat gcg att 528 Ile Ala Ala Leu Gly Glu Ile Gly Ala Val Asp Ala Val Asp Ala Ile 165 170 175 ttg gcc ttt gca tcc cat gag gac tgg tta att cgc caa aga tta gtg 576 Leu Ala Phe Ala Ser His Glu Asp Trp Leu Ile Arg Gln Arg Leu Val 180 185 190 gag gcc ctg gga aat ttg ccc tgc gac cag agt cgt tct gct ttg act 624 Glu Ala Leu Gly Asn Leu Pro Cys Asp Gln Ser Arg Ser Ala Leu Thr 195 200 205 ttc atg gtc aag gat gag cac ccc cag gtg tcc cag gcg gcc cag ttg 672 Phe Met Val Lys Asp Glu His Pro Gln Val Ser Gln Ala Ala Gln Leu 210 215 220 tcc ttg caa aaa tta gac ctg ctt agc tag 702 Ser Leu Gln Lys Leu Asp Leu Leu Ser 225 230 16 233 PRT Synechocystis sp. strain PCC6803 16 Met Ala Glu Glu Ile Leu Arg Asn Pro Ala Met Thr Ala Leu Thr Leu 1 5 10 15 Glu Gln Ile Ala Ser Gln Leu Asp Ser Pro Asn Ser Arg Asp Arg Leu 20 25 30 Ile Ala Leu Ala Ser Leu Arg Pro Tyr Ser Ser Glu Glu Ala Val Pro 35 40 45 Leu Ile Lys Lys Val Leu Asp Asp Asp Thr Leu Gln Val Arg Ser Met 50 55 60 Ala Val Phe Ala Leu Gly Ile Lys Gln Thr Glu Glu Cys Tyr Pro Ile 65 70 75 80 Leu Val Lys Leu Leu Glu Thr Asp Gly Asp Tyr Gly Ile Arg Ala Asp 85 90 95 Ala Ala Gly Ala Leu Gly Tyr Leu Glu Asp Glu Arg Ala Phe His Pro 100 105 110 Leu Cys Arg Ala Phe Tyr Glu Asp Thr Glu Trp Leu Val Arg Phe Ser 115 120 125 Ala Ala Val Ala Leu Gly Asn Leu Lys Asp Ile Arg Ala Gln Thr Val 130 135 140 Leu Leu Glu Ala Leu Lys Ser Asp Glu Ala Val Val Gln Gln Ala Ala 145 150 155 160 Ile Ala Ala Leu Gly Glu Ile Gly Ala Val Asp Ala Val Asp Ala Ile 165 170 175 Leu Ala Phe Ala Ser His Glu Asp Trp Leu Ile Arg Gln Arg Leu Val 180 185 190 Glu Ala Leu Gly Asn Leu Pro Cys Asp Gln Ser Arg Ser Ala Leu Thr 195 200 205 Phe Met Val Lys Asp Glu His Pro Gln Val Ser Gln Ala Ala Gln Leu 210 215 220 Ser Leu Gln Lys Leu Asp Leu Leu Ser 225 
230 17 543 DNA Synechocystis sp. strain PCC6803 CDS (1)..(543) 17 atg aaa acc gct gct aga att gtt gct ttt acc gct ctg act gga ttt 48 Met Lys Thr Ala Ala Arg Ile Val Ala Phe Thr Ala Leu Thr Gly Phe 1 5 10 15 gcc ctg ggg atg ccc acc gtt gcc atg gcg gaa atg gaa acc acc gaa 96 Ala Leu Gly Met Pro Thr Val Ala Met Ala Glu Met Glu Thr Thr Glu 20 25 30 aaa tct gcc gta gtt agt caa gcc gcc acg gac agc gcc atg act att 144 Lys Ser Ala Val Val Ser Gln Ala Ala Thr Asp Ser Ala Met Thr Ile 35 40 45 gtg gaa gtc gcc gca ggc aat gaa act ttc agt acc ctc gtt gca gca 192 Val Glu Val Ala Ala Gly Asn Glu Thr Phe Ser Thr Leu Val Ala Ala 50 55 60 gtc aaa gcg gct gat tta gtg gaa gct tta tcc gct gaa ggc ccc ttt 240 Val Lys Ala Ala Asp Leu Val Glu Ala Leu Ser Ala Glu Gly Pro Phe 65 70 75 80 acc gtt ttt gcc ccc acc aat gat gcc ttt gcc gct ctg ccc gct ggt 288 Thr Val Phe Ala Pro Thr Asn Asp Ala Phe Ala Ala Leu Pro Ala Gly 85 90 95 acg gtg gaa agt ctg ttg ttg ccc gaa aac aaa gat aaa ttg gtg aaa 336 Thr Val Glu Ser Leu Leu Leu Pro Glu Asn Lys Asp Lys Leu Val Lys 100 105 110 att ttg acc tac cac gtc gtt cct ggc aaa atc acc gcc gcc cag gtt 384 Ile Leu Thr Tyr His Val Val Pro Gly Lys Ile Thr Ala Ala Gln Val 115 120 125 caa tcc ggt gaa gtg gca tcc cta gct ggg gaa gcc ctc acc ttc aaa 432 Gln Ser Gly Glu Val Ala Ser Leu Ala Gly Glu Ala Leu Thr Phe Lys 130 135 140 gtc aaa gat ggc aaa gtg aaa gtt aac aaa gcc act gtc att tcc gcc 480 Val Lys Asp Gly Lys Val Lys Val Asn Lys Ala Thr Val Ile Ser Ala 145 150 155 160 gat gtg gat gcc agc aac ggt gta atc cat gtc att gac caa gta att 528 Asp Val Asp Ala Ser Asn Gly Val Ile His Val Ile Asp Gln Val Ile 165 170 175 ctg cct cct atg taa 543 Leu Pro Pro Met 180 18 180 PRT Synechocystis sp. 
strain PCC6803 18 Met Lys Thr Ala Ala Arg Ile Val Ala Phe Thr Ala Leu Thr Gly Phe 1 5 10 15 Ala Leu Gly Met Pro Thr Val Ala Met Ala Glu Met Glu Thr Thr Glu 20 25 30 Lys Ser Ala Val Val Ser Gln Ala Ala Thr Asp Ser Ala Met Thr Ile 35 40 45 Val Glu Val Ala Ala Gly Asn Glu Thr Phe Ser Thr Leu Val Ala Ala 50 55 60 Val Lys Ala Ala Asp Leu Val Glu Ala Leu Ser Ala Glu Gly Pro Phe 65 70 75 80 Thr Val Phe Ala Pro Thr Asn Asp Ala Phe Ala Ala Leu Pro Ala Gly 85 90 95 Thr Val Glu Ser Leu Leu Leu Pro Glu Asn Lys Asp Lys Leu Val Lys 100 105 110 Ile Leu Thr Tyr His Val Val Pro Gly Lys Ile Thr Ala Ala Gln Val 115 120 125 Gln Ser Gly Glu Val Ala Ser Leu Ala Gly Glu Ala Leu Thr Phe Lys 130 135 140 Val Lys Asp Gly Lys Val Lys Val Asn Lys Ala Thr Val Ile Ser Ala 145 150 155 160 Asp Val Asp Ala Ser Asn Gly Val Ile His Val Ile Asp Gln Val Ile 165 170 175 Leu Pro Pro Met 180 19 957 DNA Synechocystis sp. strain PCC6803 CDS (1)..(957) 19 atg act gcc aga acc agc ccc gat tcc gtc cgt gcc tat ctc aga gaa 48 Met Thr Ala Arg Thr Ser Pro Asp Ser Val Arg Ala Tyr Leu Arg Glu 1 5 10 15 att ggt cgt gtg ccc ctg ctc acc cat gag gaa gag att gtt tat gct 96 Ile Gly Arg Val Pro Leu Leu Thr His Glu Glu Glu Ile Val Tyr Ala 20 25 30 aag caa atc caa cag gtt gtt agc ctc aac gaa atc aag aag tct ttg 144 Lys Gln Ile Gln Gln Val Val Ser Leu Asn Glu Ile Lys Lys Ser Leu 35 40 45 gcc gaa ggc aag gat ggc gag ccg gtt tcc ccc agc gag tgg gct aag 192 Ala Glu Gly Lys Asp Gly Glu Pro Val Ser Pro Ser Glu Trp Ala Lys 50 55 60 gcg gcc gat ttg tcc att cga gaa tta gaa aaa gcc atc aag gaa ggg 240 Ala Ala Asp Leu Ser Ile Arg Glu Leu Glu Lys Ala Ile Lys Glu Gly 65 70 75 80 gaa cgg gcc aag cgc aaa atg gtg gag gct aac ctc cgg ctg gtg gta 288 Glu Arg Ala Lys Arg Lys Met Val Glu Ala Asn Leu Arg Leu Val Val 85 90 95 tct gtc gcc aaa aaa tat ctc aag cgt aat cta gac cta ctt gac ctc 336 Ser Val Ala Lys Lys Tyr Leu Lys Arg Asn Leu Asp Leu Leu Asp Leu 100 105 110 atc caa gag ggc acc att ggt atg caa cgg ggg gta gag aag ttt gac 384 Ile Gln Glu 
Gly Thr Ile Gly Met Gln Arg Gly Val Glu Lys Phe Asp 115 120 125 ccc acc aag ggt tat cgg ttt tcc acc tat gcc tat tgg tgg atc cgc 432 Pro Thr Lys Gly Tyr Arg Phe Ser Thr Tyr Ala Tyr Trp Trp Ile Arg 130 135 140 cag gcc atc acc agg gcg atc gcc gaa aag agc cgc acc atc cgt tta 480 Gln Ala Ile Thr Arg Ala Ile Ala Glu Lys Ser Arg Thr Ile Arg Leu 145 150 155 160 cca atc cac att acg gaa aag tta aac aaa att aaa aaa gcc caa aga 528 Pro Ile His Ile Thr Glu Lys Leu Asn Lys Ile Lys Lys Ala Gln Arg 165 170 175 caa ctt tcc cag gaa aag ggt cgg gcc gct tcc att gcg gaa ttg gcg 576 Gln Leu Ser Gln Glu Lys Gly Arg Ala Ala Ser Ile Ala Glu Leu Ala 180 185 190 gaa cat cta gaa tta act ccc aag caa gtg cgg gaa tat ttg gag cgc 624 Glu His Leu Glu Leu Thr Pro Lys Gln Val Arg Glu Tyr Leu Glu Arg 195 200 205 tct cgc cat ccc ctt tcc ttg gat tta cgg gtg ggg gac aac caa gat 672 Ser Arg His Pro Leu Ser Leu Asp Leu Arg Val Gly Asp Asn Gln Asp 210 215 220 act gag tta ggg gat ttg ttg gaa gac gac ggt cct tta cca gag gat 720 Thr Glu Leu Gly Asp Leu Leu Glu Asp Asp Gly Pro Leu Pro Glu Asp 225 230 235 240 ttt gcc acc tat gcc tcc cta cag ttg gat ctc gat agc ctg atg gcg 768 Phe Ala Thr Tyr Ala Ser Leu Gln Leu Asp Leu Asp Ser Leu Met Ala 245 250 255 gaa tta acg ccc caa caa cgg gaa gtt ctc att ctc cgc ttt ggc ctc 816 Glu Leu Thr Pro Gln Gln Arg Glu Val Leu Ile Leu Arg Phe Gly Leu 260 265 270 aat gat ggc caa ccc cta acc ttg gcg agc att ggc tcc atg ctc agc 864 Asn Asp Gly Gln Pro Leu Thr Leu Ala Ser Ile Gly Ser Met Leu Ser 275 280 285 atc agt cgg gaa cgg gtg cgg cag att gag cgg gaa gcc cta aat aaa 912 Ile Ser Arg Glu Arg Val Arg Gln Ile Glu Arg Glu Ala Leu Asn Lys 290 295 300 tta cgc aaa cgc aag tcc atg atc cag gaa tat tta gct agc taa 957 Leu Arg Lys Arg Lys Ser Met Ile Gln Glu Tyr Leu Ala Ser 305 310 315 20 318 PRT Synechocystis sp. 
strain PCC6803 20 Met Thr Ala Arg Thr Ser Pro Asp Ser Val Arg Ala Tyr Leu Arg Glu 1 5 10 15 Ile Gly Arg Val Pro Leu Leu Thr His Glu Glu Glu Ile Val Tyr Ala 20 25 30 Lys Gln Ile Gln Gln Val Val Ser Leu Asn Glu Ile Lys Lys Ser Leu 35 40 45 Ala Glu Gly Lys Asp Gly Glu Pro Val Ser Pro Ser Glu Trp Ala Lys 50 55 60 Ala Ala Asp Leu Ser Ile Arg Glu Leu Glu Lys Ala Ile Lys Glu Gly 65 70 75 80 Glu Arg Ala Lys Arg Lys Met Val Glu Ala Asn Leu Arg Leu Val Val 85 90 95 Ser Val Ala Lys Lys Tyr Leu Lys Arg Asn Leu Asp Leu Leu Asp Leu 100 105 110 Ile Gln Glu Gly Thr Ile Gly Met Gln Arg Gly Val Glu Lys Phe Asp 115 120 125 Pro Thr Lys Gly Tyr Arg Phe Ser Thr Tyr Ala Tyr Trp Trp Ile Arg 130 135 140 Gln Ala Ile Thr Arg Ala Ile Ala Glu Lys Ser Arg Thr Ile Arg Leu 145 150 155 160 Pro Ile His Ile Thr Glu Lys Leu Asn Lys Ile Lys Lys Ala Gln Arg 165 170 175 Gln Leu Ser Gln Glu Lys Gly Arg Ala Ala Ser Ile Ala Glu Leu Ala 180 185 190 Glu His Leu Glu Leu Thr Pro Lys Gln Val Arg Glu Tyr Leu Glu Arg 195 200 205 Ser Arg His Pro Leu Ser Leu Asp Leu Arg Val Gly Asp Asn Gln Asp 210 215 220 Thr Glu Leu Gly Asp Leu Leu Glu Asp Asp Gly Pro Leu Pro Glu Asp 225 230 235 240 Phe Ala Thr Tyr Ala Ser Leu Gln Leu Asp Leu Asp Ser Leu Met Ala 245 250 255 Glu Leu Thr Pro Gln Gln Arg Glu Val Leu Ile Leu Arg Phe Gly Leu 260 265 270 Asn Asp Gly Gln Pro Leu Thr Leu Ala Ser Ile Gly Ser Met Leu Ser 275 280 285 Ile Ser Arg Glu Arg Val Arg Gln Ile Glu Arg Glu Ala Leu Asn Lys 290 295 300 Leu Arg Lys Arg Lys Ser Met Ile Gln Glu Tyr Leu Ala Ser 305 310 315 21 213 DNA Synechocystis sp. 
strain PCC6803 CDS (1)..(213) 21 atg ggc gca ata ctc tgt tac att tat tta cat aga caa ccc tcc cag 48 Met Gly Ala Ile Leu Cys Tyr Ile Tyr Leu His Arg Gln Pro Ser Gln 1 5 10 15 ctc gta att aca ttc tta acc atg aac aac gaa aac tct aaa ttt gga 96 Leu Val Ile Thr Phe Leu Thr Met Asn Asn Glu Asn Ser Lys Phe Gly 20 25 30 ttc act gct ttc gcc gaa aac tgg aat ggt cgc ttg gcc atg atc ggt 144 Phe Thr Ala Phe Ala Glu Asn Trp Asn Gly Arg Leu Ala Met Ile Gly 35 40 45 ttt tcc tct gcc ctg atc ctc gag ctt gtc tct ggg caa ggt gta ctt 192 Phe Ser Ser Ala Leu Ile Leu Glu Leu Val Ser Gly Gln Gly Val Leu 50 55 60 cac ttc ttc ggc att ctg taa 213 His Phe Phe Gly Ile Leu 65 70 22 70 PRT Synechocystis sp. strain PCC6803 22 Met Gly Ala Ile Leu Cys Tyr Ile Tyr Leu His Arg Gln Pro Ser Gln 1 5 10 15 Leu Val Ile Thr Phe Leu Thr Met Asn Asn Glu Asn Ser Lys Phe Gly 20 25 30 Phe Thr Ala Phe Ala Glu Asn Trp Asn Gly Arg Leu Ala Met Ile Gly 35 40 45 Phe Ser Ser Ala Leu Ile Leu Glu Leu Val Ser Gly Gln Gly Val Leu 50 55 60 His Phe Phe Gly Ile Leu 65 70 23 213 DNA Synechocystis sp. strain PCC6803 CDS (1)..(213) 23 atg acc acc cgt ggc ttc cgc ttg gat cag gac aac cgt ctc aac aac 48 Met Thr Thr Arg Gly Phe Arg Leu Asp Gln Asp Asn Arg Leu Asn Asn 1 5 10 15 ttt gcc atc gaa cca gag gtt tac gtc gac tct tcc gta caa gcg ggt 96 Phe Ala Ile Glu Pro Glu Val Tyr Val Asp Ser Ser Val Gln Ala Gly 20 25 30 tgg act aaa tac gcc gaa aaa atg aat ggt cgt ttc gcc atg att ggt 144 Trp Thr Lys Tyr Ala Glu Lys Met Asn Gly Arg Phe Ala Met Ile Gly 35 40 45 ttt gcc tcc ctc ctt att atg gaa gtg gtc aca ggg cac ggc gtc att 192 Phe Ala Ser Leu Leu Ile Met Glu Val Val Thr Gly His Gly Val Ile 50 55 60 ggt tgg tta aat agc ctg tag 213 Gly Trp Leu Asn Ser Leu 65 70 24 70 PRT Synechocystis sp. 
strain PCC6803 24 Met Thr Thr Arg Gly Phe Arg Leu Asp Gln Asp Asn Arg Leu Asn Asn 1 5 10 15 Phe Ala Ile Glu Pro Glu Val Tyr Val Asp Ser Ser Val Gln Ala Gly 20 25 30 Trp Thr Lys Tyr Ala Glu Lys Met Asn Gly Arg Phe Ala Met Ile Gly 35 40 45 Phe Ala Ser Leu Leu Ile Met Glu Val Val Thr Gly His Gly Val Ile 50 55 60 Gly Trp Leu Asn Ser Leu 65 70 25 309 DNA Synechocystis sp. strain PCC6803 CDS (1)..(309) 25 atg aaa ttt tcc ctc gag tct ctc tat ggt tgg tac cgt caa atg ctg 48 Met Lys Phe Ser Leu Glu Ser Leu Tyr Gly Trp Tyr Arg Gln Met Leu 1 5 10 15 aac cat ccc cgg tac cgt tgg tgg att gtc ctc ggc tcc ttg gtg tat 96 Asn His Pro Arg Tyr Arg Trp Trp Ile Val Leu Gly Ser Leu Val Tyr 20 25 30 ctc ctc agt ccc atc gat ttt ctg ccc gac gtt ttc ccc gta ctt ggt 144 Leu Leu Ser Pro Ile Asp Phe Leu Pro Asp Val Phe Pro Val Leu Gly 35 40 45 tgg att gac gat ggt tta att gcc act ttg ctg gta tcg gaa att tcc 192 Trp Ile Asp Asp Gly Leu Ile Ala Thr Leu Leu Val Ser Glu Ile Ser 50 55 60 caa atg gtt ctc act ggc tta aaa aac aag aca acc aag cag gaa aag 240 Gln Met Val Leu Thr Gly Leu Lys Asn Lys Thr Thr Lys Gln Glu Lys 65 70 75 80 gat gcc ccc cag gaa acc gtg gtg gtg gat gtg gtg gat gtg gtg gga 288 Asp Ala Pro Gln Glu Thr Val Val Val Asp Val Val Asp Val Val Gly 85 90 95 cag gac gtg gcc cac agt taa 309 Gln Asp Val Ala His Ser 100 26 102 PRT Synechocystis sp. strain PCC6803 26 Met Lys Phe Ser Leu Glu Ser Leu Tyr Gly Trp Tyr Arg Gln Met Leu 1 5 10 15 Asn His Pro Arg Tyr Arg Trp Trp Ile Val Leu Gly Ser Leu Val Tyr 20 25 30 Leu Leu Ser Pro Ile Asp Phe Leu Pro Asp Val Phe Pro Val Leu Gly 35 40 45 Trp Ile Asp Asp Gly Leu Ile Ala Thr Leu Leu Val Ser Glu Ile Ser 50 55 60 Gln Met Val Leu Thr Gly Leu Lys Asn Lys Thr Thr Lys Gln Glu Lys 65 70 75 80 Asp Ala Pro Gln Glu Thr Val Val Val Asp Val Val Asp Val Val Gly 85 90 95 Gln Asp Val Ala His Ser 100 27 351 DNA Synechocystis sp. 
strain PCC6803 CDS (1)..(351) 27 gtg acc cat gaa ccc caa cgt ccc caa ccg tta ttc gct ggc aat gaa 48 Val Thr His Glu Pro Gln Arg Pro Gln Pro Leu Phe Ala Gly Asn Glu 1 5 10 15 gcc cca ggc aaa gat agt ttg tgg aca tac gtt caa gaa tta agc ccc 96 Ala Pro Gly Lys Asp Ser Leu Trp Thr Tyr Val Gln Glu Leu Ser Pro 20 25 30 gaa acc att gcc caa tta tct cgc ccc gat tcc cag gaa gtg ttt cag 144 Glu Thr Ile Ala Gln Leu Ser Arg Pro Asp Ser Gln Glu Val Phe Gln 35 40 45 gtg atg gag cgc aac att atc ggt ctg ttg gga aat tta ccc ccg gag 192 Val Met Glu Arg Asn Ile Ile Gly Leu Leu Gly Asn Leu Pro Pro Glu 50 55 60 cac ttt ggg gta acc atc agc act agc cgg gaa aat ttg ggc cgt ctt 240 His Phe Gly Val Thr Ile Ser Thr Ser Arg Glu Asn Leu Gly Arg Leu 65 70 75 80 tta gcc tcc gcc atg atg agt ggc tat ttt ctt cgc aac gcc gag caa 288 Leu Ala Ser Ala Met Met Ser Gly Tyr Phe Leu Arg Asn Ala Glu Gln 85 90 95 agg tta gga ttt gaa caa gct ttt aaa agt agc agc aac agc aac gag 336 Arg Leu Gly Phe Glu Gln Ala Phe Lys Ser Ser Ser Asn Ser Asn Glu 100 105 110 aat acc gaa tac taa 351 Asn Thr Glu Tyr 115 28 116 PRT Synechocystis sp. strain PCC6803 28 Val Thr His Glu Pro Gln Arg Pro Gln Pro Leu Phe Ala Gly Asn Glu 1 5 10 15 Ala Pro Gly Lys Asp Ser Leu Trp Thr Tyr Val Gln Glu Leu Ser Pro 20 25 30 Glu Thr Ile Ala Gln Leu Ser Arg Pro Asp Ser Gln Glu Val Phe Gln 35 40 45 Val Met Glu Arg Asn Ile Ile Gly Leu Leu Gly Asn Leu Pro Pro Glu 50 55 60 His Phe Gly Val Thr Ile Ser Thr Ser Arg Glu Asn Leu Gly Arg Leu 65 70 75 80 Leu Ala Ser Ala Met Met Ser Gly Tyr Phe Leu Arg Asn Ala Glu Gln 85 90 95 Arg Leu Gly Phe Glu Gln Ala Phe Lys Ser Ser Ser Asn Ser Asn Glu 100 105 110 Asn Thr Glu Tyr 115 29 1851 DNA Synechocystis sp. 
strain PCC6803 CDS (1)..(1851) 29 gtg agc aaa aat aat aaa aaa tgg cgt aac gcg ggc cta tat gcc ttg 48 Val Ser Lys Asn Asn Lys Lys Trp Arg Asn Ala Gly Leu Tyr Ala Leu 1 5 10 15 ttg tta att gtc gtt tta gcg ttg gca tcg gcc ttt ttc gac cga ccg 96 Leu Leu Ile Val Val Leu Ala Leu Ala Ser Ala Phe Phe Asp Arg Pro 20 25 30 acc caa act agg gaa acc ctc agc tac agc gat ttt gtc aat cgg gta 144 Thr Gln Thr Arg Glu Thr Leu Ser Tyr Ser Asp Phe Val Asn Arg Val 35 40 45 gaa gcc aat cag atc gaa cgg gtc aac ctc agt gcc gac cgc acc caa 192 Glu Ala Asn Gln Ile Glu Arg Val Asn Leu Ser Ala Asp Arg Thr Gln 50 55 60 gcc caa gta ccc aat ccc agc ggt ggt cct ccc tac tta gtc aat ctg 240 Ala Gln Val Pro Asn Pro Ser Gly Gly Pro Pro Tyr Leu Val Asn Leu 65 70 75 80 ccc aac gac ccc gac ttg atc aat att ctc acc caa cac aac gtg gat 288 Pro Asn Asp Pro Asp Leu Ile Asn Ile Leu Thr Gln His Asn Val Asp 85 90 95 att gct gtc caa ccc cag agc gac gaa ggt ttc tgg ttc cgc atc gcc 336 Ile Ala Val Gln Pro Gln Ser Asp Glu Gly Phe Trp Phe Arg Ile Ala 100 105 110 agc acc cta ttt ttg ccc atc ttg ctc ttg gtg gga att ttt ttc ctc 384 Ser Thr Leu Phe Leu Pro Ile Leu Leu Leu Val Gly Ile Phe Phe Leu 115 120 125 ttc cgt cgg gcc cag agt ggc cct ggt tcc caa gcc atg aac ttt ggt 432 Phe Arg Arg Ala Gln Ser Gly Pro Gly Ser Gln Ala Met Asn Phe Gly 130 135 140 aaa tcc aaa gca cgg gtg caa atg gaa ccc caa acc caa gtt acc ttc 480 Lys Ser Lys Ala Arg Val Gln Met Glu Pro Gln Thr Gln Val Thr Phe 145 150 155 160 ggg gac gtg gcc ggt att gag caa gcc aaa cta gaa ctc acc gaa gtg 528 Gly Asp Val Ala Gly Ile Glu Gln Ala Lys Leu Glu Leu Thr Glu Val 165 170 175 gtg gac ttc ctg aaa aat gca gac cgc ttc acc gaa ttg gga gcc aaa 576 Val Asp Phe Leu Lys Asn Ala Asp Arg Phe Thr Glu Leu Gly Ala Lys 180 185 190 att ccc aag ggt gtt ttg ttg gta ggc ccc ccc gga acc ggt aaa acc 624 Ile Pro Lys Gly Val Leu Leu Val Gly Pro Pro Gly Thr Gly Lys Thr 195 200 205 ctg ttg gcc aaa gcc gtg gct ggg gaa gcg ggt gta ccg ttc ttt tcc 672 Leu Leu Ala Lys Ala Val Ala Gly Glu 
Ala Gly Val Pro Phe Phe Ser 210 215 220 atc tcc ggt tcg gaa ttt gtg gaa atg ttt gtc ggt gtt ggt gct tct 720 Ile Ser Gly Ser Glu Phe Val Glu Met Phe Val Gly Val Gly Ala Ser 225 230 235 240 cgg gta cgg gat ttg ttt gag cag gct aaa gcc aat gct ccc tgt atc 768 Arg Val Arg Asp Leu Phe Glu Gln Ala Lys Ala Asn Ala Pro Cys Ile 245 250 255 gtc ttc atc gat gaa att gat gcc gtt ggt cgt caa cgg ggc gct ggc 816 Val Phe Ile Asp Glu Ile Asp Ala Val Gly Arg Gln Arg Gly Ala Gly 260 265 270 ctt ggt ggt ggt aat gat gag cgg gaa cag acc ctc aac cag ttg cta 864 Leu Gly Gly Gly Asn Asp Glu Arg Glu Gln Thr Leu Asn Gln Leu Leu 275 280 285 acg gaa atg gac ggt ttt gaa ggc aac acc ggc att att atc gtc gcc 912 Thr Glu Met Asp Gly Phe Glu Gly Asn Thr Gly Ile Ile Ile Val Ala 290 295 300 gcc act aac cgt ccc gat gta ttg gat tct gcc ttg atg cgt ccc ggt 960 Ala Thr Asn Arg Pro Asp Val Leu Asp Ser Ala Leu Met Arg Pro Gly 305 310 315 320 cgt ttc gat cgc caa gtg gta gta gac cgt cct gat tat gct ggc cgt 1008 Arg Phe Asp Arg Gln Val Val Val Asp Arg Pro Asp Tyr Ala Gly Arg 325 330 335 cga gaa atc ctc aat gtc cat gcc cgg ggt aaa acc ctt tcc cag gat 1056 Arg Glu Ile Leu Asn Val His Ala Arg Gly Lys Thr Leu Ser Gln Asp 340 345 350 gtg gat ttg gat aaa att gcc cgt cgt acc cct gga ttt acc ggt gct 1104 Val Asp Leu Asp Lys Ile Ala Arg Arg Thr Pro Gly Phe Thr Gly Ala 355 360 365 gac ctg tcc aac ctg ttg aac gaa gcc gct att ttg gct gcc cgt cgc 1152 Asp Leu Ser Asn Leu Leu Asn Glu Ala Ala Ile Leu Ala Ala Arg Arg 370 375 380 aac ttg acc gaa att tcc atg gac gaa gtc aac gac gcc att gac cgg 1200 Asn Leu Thr Glu Ile Ser Met Asp Glu Val Asn Asp Ala Ile Asp Arg 385 390 395 400 gtg ttg gct ggt cct gag aag aaa aat cgg gtg atg agc gaa aaa cgc 1248 Val Leu Ala Gly Pro Glu Lys Lys Asn Arg Val Met Ser Glu Lys Arg 405 410 415 aaa acc cta gtg gct tac cat gaa gct ggc cac gcc ttg gtg ggt gct 1296 Lys Thr Leu Val Ala Tyr His Glu Ala Gly His Ala Leu Val Gly Ala 420 425 430 ttg atg cct gat tat gat cca gta caa aaa att agc att att ccc cgc 1344 
Leu Met Pro Asp Tyr Asp Pro Val Gln Lys Ile Ser Ile Ile Pro Arg 435 440 445 ggc cgg gcc ggt ggt tta acc tgg ttc acc ccc agt gaa gac cgt atg 1392 Gly Arg Ala Gly Gly Leu Thr Trp Phe Thr Pro Ser Glu Asp Arg Met 450 455 460 gaa tcc ggt tta tac tcc cgt tcc tat ctg caa aat cag atg gcc gtt 1440 Glu Ser Gly Leu Tyr Ser Arg Ser Tyr Leu Gln Asn Gln Met Ala Val 465 470 475 480 gcc ctg gga ggc cgt att gct gag gaa att att ttc ggc gaa gag gaa 1488 Ala Leu Gly Gly Arg Ile Ala Glu Glu Ile Ile Phe Gly Glu Glu Glu 485 490 495 gtc acc acc ggt gct tcc aac gac ctc caa cag gta gcc cgg gtc gcc 1536 Val Thr Thr Gly Ala Ser Asn Asp Leu Gln Gln Val Ala Arg Val Ala 500 505 510 cgc caa atg gta acc cgt ttc ggc atg agc gat cgc ctg ggc ccg gta 1584 Arg Gln Met Val Thr Arg Phe Gly Met Ser Asp Arg Leu Gly Pro Val 515 520 525 gct ttg ggt cgt cag ggt ggt ggg gta ttc ctt ggt cgg gac att gcc 1632 Ala Leu Gly Arg Gln Gly Gly Gly Val Phe Leu Gly Arg Asp Ile Ala 530 535 540 tct gac cgg gac ttt tcc gat gaa acc gct gcg gcg atc gat gag gaa 1680 Ser Asp Arg Asp Phe Ser Asp Glu Thr Ala Ala Ala Ile Asp Glu Glu 545 550 555 560 gta agt caa ttg gta gac caa gcc tat caa cgg gcc aaa cag gtc ttg 1728 Val Ser Gln Leu Val Asp Gln Ala Tyr Gln Arg Ala Lys Gln Val Leu 565 570 575 gtg gaa aac cgt ggc att tta gat caa ctg gca gaa atc ttg gta gaa 1776 Val Glu Asn Arg Gly Ile Leu Asp Gln Leu Ala Glu Ile Leu Val Glu 580 585 590 aag gaa act gtt gat tct gaa gag ctg caa act ctc ctg gct aac aac 1824 Lys Glu Thr Val Asp Ser Glu Glu Leu Gln Thr Leu Leu Ala Asn Asn 595 600 605 aat gcc aaa ttg gca ctt cta gtt taa 1851 Asn Ala Lys Leu Ala Leu Leu Val 610 615 30 616 PRT Synechocystis sp. 
strain PCC6803 30 Val Ser Lys Asn Asn Lys Lys Trp Arg Asn Ala Gly Leu Tyr Ala Leu 1 5 10 15 Leu Leu Ile Val Val Leu Ala Leu Ala Ser Ala Phe Phe Asp Arg Pro 20 25 30 Thr Gln Thr Arg Glu Thr Leu Ser Tyr Ser Asp Phe Val Asn Arg Val 35 40 45 Glu Ala Asn Gln Ile Glu Arg Val Asn Leu Ser Ala Asp Arg Thr Gln 50 55 60 Ala Gln Val Pro Asn Pro Ser Gly Gly Pro Pro Tyr Leu Val Asn Leu 65 70 75 80 Pro Asn Asp Pro Asp Leu Ile Asn Ile Leu Thr Gln His Asn Val Asp 85 90 95 Ile Ala Val Gln Pro Gln Ser Asp Glu Gly Phe Trp Phe Arg Ile Ala 100 105 110 Ser Thr Leu Phe Leu Pro Ile Leu Leu Leu Val Gly Ile Phe Phe Leu 115 120 125 Phe Arg Arg Ala Gln Ser Gly Pro Gly Ser Gln Ala Met Asn Phe Gly 130 135 140 Lys Ser Lys Ala Arg Val Gln Met Glu Pro Gln Thr Gln Val Thr Phe 145 150 155 160 Gly Asp Val Ala Gly Ile Glu Gln Ala Lys Leu Glu Leu Thr Glu Val 165 170 175 Val Asp Phe Leu Lys Asn Ala Asp Arg Phe Thr Glu Leu Gly Ala Lys 180 185 190 Ile Pro Lys Gly Val Leu Leu Val Gly Pro Pro Gly Thr Gly Lys Thr 195 200 205 Leu Leu Ala Lys Ala Val Ala Gly Glu Ala Gly Val Pro Phe Phe Ser 210 215 220 Ile Ser Gly Ser Glu Phe Val Glu Met Phe Val Gly Val Gly Ala Ser 225 230 235 240 Arg Val Arg Asp Leu Phe Glu Gln Ala Lys Ala Asn Ala Pro Cys Ile 245 250 255 Val Phe Ile Asp Glu Ile Asp Ala Val Gly Arg Gln Arg Gly Ala Gly 260 265 270 Leu Gly Gly Gly Asn Asp Glu Arg Glu Gln Thr Leu Asn Gln Leu Leu 275 280 285 Thr Glu Met Asp Gly Phe Glu Gly Asn Thr Gly Ile Ile Ile Val Ala 290 295 300 Ala Thr Asn Arg Pro Asp Val Leu Asp Ser Ala Leu Met Arg Pro Gly 305 310 315 320 Arg Phe Asp Arg Gln Val Val Val Asp Arg Pro Asp Tyr Ala Gly Arg 325 330 335 Arg Glu Ile Leu Asn Val His Ala Arg Gly Lys Thr Leu Ser Gln Asp 340 345 350 Val Asp Leu Asp Lys Ile Ala Arg Arg Thr Pro Gly Phe Thr Gly Ala 355 360 365 Asp Leu Ser Asn Leu Leu Asn Glu Ala Ala Ile Leu Ala Ala Arg Arg 370 375 380 Asn Leu Thr Glu Ile Ser Met Asp Glu Val Asn Asp Ala Ile Asp Arg 385 390 395 400 Val Leu Ala Gly Pro Glu Lys Lys Asn Arg Val Met Ser Glu Lys Arg 405 410 415 Lys 
Thr Leu Val Ala Tyr His Glu Ala Gly His Ala Leu Val Gly Ala 420 425 430 Leu Met Pro Asp Tyr Asp Pro Val Gln Lys Ile Ser Ile Ile Pro Arg 435 440 445 Gly Arg Ala Gly Gly Leu Thr Trp Phe Thr Pro Ser Glu Asp Arg Met 450 455 460 Glu Ser Gly Leu Tyr Ser Arg Ser Tyr Leu Gln Asn Gln Met Ala Val 465 470 475 480 Ala Leu Gly Gly Arg Ile Ala Glu Glu Ile Ile Phe Gly Glu Glu Glu 485 490 495 Val Thr Thr Gly Ala Ser Asn Asp Leu Gln Gln Val Ala Arg Val Ala 500 505 510 Arg Gln Met Val Thr Arg Phe Gly Met Ser Asp Arg Leu Gly Pro Val 515 520 525 Ala Leu Gly Arg Gln Gly Gly Gly Val Phe Leu Gly Arg Asp Ile Ala 530 535 540 Ser Asp Arg Asp Phe Ser Asp Glu Thr Ala Ala Ala Ile Asp Glu Glu 545 550 555 560 Val Ser Gln Leu Val Asp Gln Ala Tyr Gln Arg Ala Lys Gln Val Leu 565 570 575 Val Glu Asn Arg Gly Ile Leu Asp Gln Leu Ala Glu Ile Leu Val Glu 580 585 590 Lys Glu Thr Val Asp Ser Glu Glu Leu Gln Thr Leu Leu Ala Asn Asn 595 600 605 Asn Ala Lys Leu Ala Leu Leu Val 610 615 31 1596 DNA Synechocystis sp. strain PCC6803 CDS (1)..(1596) 31 atg atc gat cgc ctt ttg tac gtt cgt ctt ccc tgt aac ccg att ttc 48 Met Ile Asp Arg Leu Leu Tyr Val Arg Leu Pro Cys Asn Pro Ile Phe 1 5 10 15 ccc att ggg gtg att tat ctg gcg gac cat gtc cat aaa tgt ttt ccg 96 Pro Ile Gly Val Ile Tyr Leu Ala Asp His Val His Lys Cys Phe Pro 20 25 30 gcg acc gcc cag cgg att ttc gat tta ggc acc att cct ccc ctg gat 144 Ala Thr Ala Gln Arg Ile Phe Asp Leu Gly Thr Ile Pro Pro Leu Asp 35 40 45 ttc aac cgg gcc ctt gat gaa tgt att gat gaa ttt cag ccg aca att 192 Phe Asn Arg Ala Leu Asp Glu Cys Ile Asp Glu Phe Gln Pro Thr Ile 50 55 60 ttg gtt ttt tcc tgg cgg gac att caa atc tat gct ccg gtg ggg ggt 240 Leu Val Phe Ser Trp Arg Asp Ile Gln Ile Tyr Ala Pro Val Gly Gly 65 70 75 80 agg ggt ggt aat ccc ctg cag aac gcg ttt gag ttt tac tac gga aaa 288 Arg Gly Gly Asn Pro Leu Gln Asn Ala Phe Glu Phe Tyr Tyr Gly Lys 85 90 95 aat ccc ttg gtg aag ctg agg gga gcc tta ggt ggt ttg aaa gtt acc 336 Asn Pro Leu Val Lys Leu Arg Gly Ala Leu Gly Gly Leu Lys Val Thr 
100 105 110 agt gcc tat tac ggc gaa tta tgg cgt aat tta aga cta ata aac cgg 384 Ser Ala Tyr Tyr Gly Glu Leu Trp Arg Asn Leu Arg Leu Ile Asn Arg 115 120 125 gga ttg cgg agg gca aag cgt tat tgc agt gat ccc caa atc att gtc 432 Gly Leu Arg Arg Ala Lys Arg Tyr Cys Ser Asp Pro Gln Ile Ile Val 130 135 140 ggg ggc gga gca gtt agt gtt ttt tac gaa cag tta aaa acc aag ttg 480 Gly Gly Gly Ala Val Ser Val Phe Tyr Glu Gln Leu Lys Thr Lys Leu 145 150 155 160 cca gcg ggc acc att gtg tct gtg gga gaa ggg gaa acc ctg tta gaa 528 Pro Ala Gly Thr Ile Val Ser Val Gly Glu Gly Glu Thr Leu Leu Glu 165 170 175 aaa tat cta cgg ggg caa acc att gaa gac gaa cgg tgt tac ata gtc 576 Lys Tyr Leu Arg Gly Gln Thr Ile Glu Asp Glu Arg Cys Tyr Ile Val 180 185 190 ggc cgc agt cag ccc cgg ccc cgg tta atc cat gaa cag ccc tcc ccc 624 Gly Arg Ser Gln Pro Arg Pro Arg Leu Ile His Glu Gln Pro Ser Pro 195 200 205 atg gta aaa act gcc tgt gat tat gac tac atc gag caa att tgg ccg 672 Met Val Lys Thr Ala Cys Asp Tyr Asp Tyr Ile Glu Gln Ile Trp Pro 210 215 220 gcc ttt gac tat tac ctc cag gag gat gat ttt tac cta ggg gta caa 720 Ala Phe Asp Tyr Tyr Leu Gln Glu Asp Asp Phe Tyr Leu Gly Val Gln 225 230 235 240 act aag cgg ggt tgt ccc cac aat tgc tgt tac tgc gtt tac acc gtg 768 Thr Lys Arg Gly Cys Pro His Asn Cys Cys Tyr Cys Val Tyr Thr Val 245 250 255 gtg gaa ggg aaa cag gtc aga att aat ccc gcc gcc gaa gtg gtc aag 816 Val Glu Gly Lys Gln Val Arg Ile Asn Pro Ala Ala Glu Val Val Lys 260 265 270 gaa atg cgg caa ctt tat gac cgg ggc att cgc aat ttt tgg ttc acc 864 Glu Met Arg Gln Leu Tyr Asp Arg Gly Ile Arg Asn Phe Trp Phe Thr 275 280 285 gat gct caa ttt att ccg gct agg gtt ttt ata gat gat gtg gtg gaa 912 Asp Ala Gln Phe Ile Pro Ala Arg Val Phe Ile Asp Asp Val Val Glu 290 295 300 ttg ctg gag gcg atc gcc gcg tcg ggc atg gag gat atc cat tgg gct 960 Leu Leu Glu Ala Ile Ala Ala Ser Gly Met Glu Asp Ile His Trp Ala 305 310 315 320 gcc tat atc cga gct gac aat tta acc cct cgg ttg tgt gaa ctg atg 1008 Ala Tyr Ile Arg Ala Asp Asn Leu 
Thr Pro Arg Leu Cys Glu Leu Met 325 330 335 gta caa acg ggg atg aac tac ttt gaa att ggt atc acc agt ggt tcc 1056 Val Gln Thr Gly Met Asn Tyr Phe Glu Ile Gly Ile Thr Ser Gly Ser 340 345 350 cag gaa ttg gta cgc aaa atg cgc atg ggt tac aat ctc cgc acc gtg 1104 Gln Glu Leu Val Arg Lys Met Arg Met Gly Tyr Asn Leu Arg Thr Val 355 360 365 tta cag aat tgt cgg gat cta aag ggg gca ggc ttt aat gat ttg gtt 1152 Leu Gln Asn Cys Arg Asp Leu Lys Gly Ala Gly Phe Asn Asp Leu Val 370 375 380 tcc gtc aat tat tcc ttc aat gtt att gat gaa acc cta gac acc atc 1200 Ser Val Asn Tyr Ser Phe Asn Val Ile Asp Glu Thr Leu Asp Thr Ile 385 390 395 400 cgc caa acc att gcc tac cat cgg gag tta gaa gct att ttt ggg gca 1248 Arg Gln Thr Ile Ala Tyr His Arg Glu Leu Glu Ala Ile Phe Gly Ala 405 410 415 gac aaa gta gaa cca gcc att ttc ttc att ggg cta cag ccc cac acc 1296 Asp Lys Val Glu Pro Ala Ile Phe Phe Ile Gly Leu Gln Pro His Thr 420 425 430 cat ctg gaa acc tat gcc ctg gac aag gaa att ctc aaa cca ggc tat 1344 His Leu Glu Thr Tyr Ala Leu Asp Lys Glu Ile Leu Lys Pro Gly Tyr 435 440 445 gac ccc atg agc atg atg ccc tgg acc gcc aaa aaa tta ctc tgg aat 1392 Asp Pro Met Ser Met Met Pro Trp Thr Ala Lys Lys Leu Leu Trp Asn 450 455 460 cca gaa ccc ctc ggc tcg ttt ttc ggc gaa gtt tgc ctc cag gct tgg 1440 Pro Glu Pro Leu Gly Ser Phe Phe Gly Glu Val Cys Leu Gln Ala Trp 465 470 475 480 caa caa aat ccc aat gat ttc ggt cga gaa gtg atg aat att ctc gag 1488 Gln Gln Asn Pro Asn Asp Phe Gly Arg Glu Val Met Asn Ile Leu Glu 485 490 495 caa cgg ctg ggc aaa gct gat ctc gaa aca gcc ctc cac tcc ccc ctg 1536 Gln Arg Leu Gly Lys Ala Asp Leu Glu Thr Ala Leu His Ser Pro Leu 500 505 510 ccc gac aaa aaa aaa ttt ccc cct acc atg gca gag gga aaa aaa ctc 1584 Pro Asp Lys Lys Lys Phe Pro Pro Thr Met Ala Glu Gly Lys Lys Leu 515 520 525 agt cct att tag 1596 Ser Pro Ile 530 32 531 PRT Synechocystis sp. 
strain PCC6803 32 Met Ile Asp Arg Leu Leu Tyr Val Arg Leu Pro Cys Asn Pro Ile Phe 1 5 10 15 Pro Ile Gly Val Ile Tyr Leu Ala Asp His Val His Lys Cys Phe Pro 20 25 30 Ala Thr Ala Gln Arg Ile Phe Asp Leu Gly Thr Ile Pro Pro Leu Asp 35 40 45 Phe Asn Arg Ala Leu Asp Glu Cys Ile Asp Glu Phe Gln Pro Thr Ile 50 55 60 Leu Val Phe Ser Trp Arg Asp Ile Gln Ile Tyr Ala Pro Val Gly Gly 65 70 75 80 Arg Gly Gly Asn Pro Leu Gln Asn Ala Phe Glu Phe Tyr Tyr Gly Lys 85 90 95 Asn Pro Leu Val Lys Leu Arg Gly Ala Leu Gly Gly Leu Lys Val Thr 100 105 110 Ser Ala Tyr Tyr Gly Glu Leu Trp Arg Asn Leu Arg Leu Ile Asn Arg 115 120 125 Gly Leu Arg Arg Ala Lys Arg Tyr Cys Ser Asp Pro Gln Ile Ile Val 130 135 140 Gly Gly Gly Ala Val Ser Val Phe Tyr Glu Gln Leu Lys Thr Lys Leu 145 150 155 160 Pro Ala Gly Thr Ile Val Ser Val Gly Glu Gly Glu Thr Leu Leu Glu 165 170 175 Lys Tyr Leu Arg Gly Gln Thr Ile Glu Asp Glu Arg Cys Tyr Ile Val 180 185 190 Gly Arg Ser Gln Pro Arg Pro Arg Leu Ile His Glu Gln Pro Ser Pro 195 200 205 Met Val Lys Thr Ala Cys Asp Tyr Asp Tyr Ile Glu Gln Ile Trp Pro 210 215 220 Ala Phe Asp Tyr Tyr Leu Gln Glu Asp Asp Phe Tyr Leu Gly Val Gln 225 230 235 240 Thr Lys Arg Gly Cys Pro His Asn Cys Cys Tyr Cys Val Tyr Thr Val 245 250 255 Val Glu Gly Lys Gln Val Arg Ile Asn Pro Ala Ala Glu Val Val Lys 260 265 270 Glu Met Arg Gln Leu Tyr Asp Arg Gly Ile Arg Asn Phe Trp Phe Thr 275 280 285 Asp Ala Gln Phe Ile Pro Ala Arg Val Phe Ile Asp Asp Val Val Glu 290 295 300 Leu Leu Glu Ala Ile Ala Ala Ser Gly Met Glu Asp Ile His Trp Ala 305 310 315 320 Ala Tyr Ile Arg Ala Asp Asn Leu Thr Pro Arg Leu Cys Glu Leu Met 325 330 335 Val Gln Thr Gly Met Asn Tyr Phe Glu Ile Gly Ile Thr Ser Gly Ser 340 345 350 Gln Glu Leu Val Arg Lys Met Arg Met Gly Tyr Asn Leu Arg Thr Val 355 360 365 Leu Gln Asn Cys Arg Asp Leu Lys Gly Ala Gly Phe Asn Asp Leu Val 370 375 380 Ser Val Asn Tyr Ser Phe Asn Val Ile Asp Glu Thr Leu Asp Thr Ile 385 390 395 400 Arg Gln Thr Ile Ala Tyr His Arg Glu Leu Glu Ala Ile Phe Gly Ala 405 410 415 Asp 
Lys Val Glu Pro Ala Ile Phe Phe Ile Gly Leu Gln Pro His Thr 420 425 430 His Leu Glu Thr Tyr Ala Leu Asp Lys Glu Ile Leu Lys Pro Gly Tyr 435 440 445 Asp Pro Met Ser Met Met Pro Trp Thr Ala Lys Lys Leu Leu Trp Asn 450 455 460 Pro Glu Pro Leu Gly Ser Phe Phe Gly Glu Val Cys Leu Gln Ala Trp 465 470 475 480 Gln Gln Asn Pro Asn Asp Phe Gly Arg Glu Val Met Asn Ile Leu Glu 485 490 495 Gln Arg Leu Gly Lys Ala Asp Leu Glu Thr Ala Leu His Ser Pro Leu 500 505 510 Pro Asp Lys Lys Lys Phe Pro Pro Thr Met Ala Glu Gly Lys Lys Leu 515 520 525 Ser Pro Ile 530 33 1038 DNA Synechocystis sp. strain PCC6803 CDS (1)..(1038) 33 atg gta aca gtg aca gtt att ctg ttg ctc ttc att aag gag tca ttc 48 Met Val Thr Val Thr Val Ile Leu Leu Leu Phe Ile Lys Glu Ser Phe 1 5 10 15 cga atg ccc acc gcc aat ctc tcc tcc cct acc tcc ccc ccc act ttc 96 Arg Met Pro Thr Ala Asn Leu Ser Ser Pro Thr Ser Pro Pro Thr Phe 20 25 30 acc gcc gat atg gtg agg tcc tat ctc cat gaa att ggt cgg gta ccc 144 Thr Ala Asp Met Val Arg Ser Tyr Leu His Glu Ile Gly Arg Val Pro 35 40 45 ctg tta acc cat gag caa gaa att atc ctc ggt aaa caa gtc caa caa 192 Leu Leu Thr His Glu Gln Glu Ile Ile Leu Gly Lys Gln Val Gln Gln 50 55 60 atg atg gcc ctg ctg gag cac aag aaa gcc ctg gct gac aga ttg ggc 240 Met Met Ala Leu Leu Glu His Lys Lys Ala Leu Ala Asp Arg Leu Gly 65 70 75 80 cga gag ccc tcc gac ccg gaa tgg gcg gaa gcg gcg gat ttg tcg gtg 288 Arg Glu Pro Ser Asp Pro Glu Trp Ala Glu Ala Ala Asp Leu Ser Val 85 90 95 acg aaa tta cac cgc tat ctg ggc caa ggg gaa cgg gcc aaa cgg aaa 336 Thr Lys Leu His Arg Tyr Leu Gly Gln Gly Glu Arg Ala Lys Arg Lys 100 105 110 atg att gaa gct aac ctc cgg ttg gtg gtg gcg atc gcc aag aaa tat 384 Met Ile Glu Ala Asn Leu Arg Leu Val Val Ala Ile Ala Lys Lys Tyr 115 120 125 cag aag cgc aat atg gag ttt ttg gat ttg atc caa gaa ggt agc ctg 432 Gln Lys Arg Asn Met Glu Phe Leu Asp Leu Ile Gln Glu Gly Ser Leu 130 135 140 ggt tta gaa cgg ggg gtg gaa aaa ttc gac ccc acc aag ggt tat aaa 480 Gly Leu Glu Arg Gly Val Glu Lys Phe Asp 
Pro Thr Lys Gly Tyr Lys 145 150 155 160 ttc tcc acc tat gcc tac tgg tgg att cgc caa gcc atc acc cgg gcg 528 Phe Ser Thr Tyr Ala Tyr Trp Trp Ile Arg Gln Ala Ile Thr Arg Ala 165 170 175 atc gcc caa cag ggc cgg act atc cgt ttg ccc att cat atc act gaa 576 Ile Ala Gln Gln Gly Arg Thr Ile Arg Leu Pro Ile His Ile Thr Glu 180 185 190 aag tta aac aaa atc aaa aaa acc caa cgg gaa ctt tcc caa caa ttg 624 Lys Leu Asn Lys Ile Lys Lys Thr Gln Arg Glu Leu Ser Gln Gln Leu 195 200 205 ggc cgc agt gcc acc ccc gcc gaa gta gcc aag gct ctg gaa att gac 672 Gly Arg Ser Ala Thr Pro Ala Glu Val Ala Lys Ala Leu Glu Ile Asp 210 215 220 cct agt caa att cgc gag tac ctc agt ctg tcg cgc caa ccc atc tcc 720 Pro Ser Gln Ile Arg Glu Tyr Leu Ser Leu Ser Arg Gln Pro Ile Ser 225 230 235 240 ctc gat gtg cgg gtg ggg gat aat cag gac aca gaa ttg tcc gaa ctc 768 Leu Asp Val Arg Val Gly Asp Asn Gln Asp Thr Glu Leu Ser Glu Leu 245 250 255 ttg gag gac gaa ggg gtt tcc ccc gat gct tac atc acc cag gag tcc 816 Leu Glu Asp Glu Gly Val Ser Pro Asp Ala Tyr Ile Thr Gln Glu Ser 260 265 270 atg cgt caa gac ctg caa aat tta ctg gcg gaa tta aca ccc cag caa 864 Met Arg Gln Asp Leu Gln Asn Leu Leu Ala Glu Leu Thr Pro Gln Gln 275 280 285 cag gct gtg ctg acc atg cgt ttt ggt ctt aac gat ggc caa gag cta 912 Gln Ala Val Leu Thr Met Arg Phe Gly Leu Asn Asp Gly Gln Glu Leu 290 295 300 tct ttg gct aaa atc ggc cag cat ctc aac atc agc cgg gaa agg gtc 960 Ser Leu Ala Lys Ile Gly Gln His Leu Asn Ile Ser Arg Glu Arg Val 305 310 315 320 cgc caa tta gaa aac caa gcc ctt gcg caa ctg aag cgt cgg cgg gct 1008 Arg Gln Leu Glu Asn Gln Ala Leu Ala Gln Leu Lys Arg Arg Arg Ala 325 330 335 aat atg gca gag tat att atc gcc agt tag 1038 Asn Met Ala Glu Tyr Ile Ile Ala Ser 340 345 34 345 PRT Synechocystis sp. 
strain PCC6803 34 Met Val Thr Val Thr Val Ile Leu Leu Leu Phe Ile Lys Glu Ser Phe 1 5 10 15 Arg Met Pro Thr Ala Asn Leu Ser Ser Pro Thr Ser Pro Pro Thr Phe 20 25 30 Thr Ala Asp Met Val Arg Ser Tyr Leu His Glu Ile Gly Arg Val Pro 35 40 45 Leu Leu Thr His Glu Gln Glu Ile Ile Leu Gly Lys Gln Val Gln Gln 50 55 60 Met Met Ala Leu Leu Glu His Lys Lys Ala Leu Ala Asp Arg Leu Gly 65 70 75 80 Arg Glu Pro Ser Asp Pro Glu Trp Ala Glu Ala Ala Asp Leu Ser Val 85 90 95 Thr Lys Leu His Arg Tyr Leu Gly Gln Gly Glu Arg Ala Lys Arg Lys 100 105 110 Met Ile Glu Ala Asn Leu Arg Leu Val Val Ala Ile Ala Lys Lys Tyr 115 120 125 Gln Lys Arg Asn Met Glu Phe Leu Asp Leu Ile Gln Glu Gly Ser Leu 130 135 140 Gly Leu Glu Arg Gly Val Glu Lys Phe Asp Pro Thr Lys Gly Tyr Lys 145 150 155 160 Phe Ser Thr Tyr Ala Tyr Trp Trp Ile Arg Gln Ala Ile Thr Arg Ala 165 170 175 Ile Ala Gln Gln Gly Arg Thr Ile Arg Leu Pro Ile His Ile Thr Glu 180 185 190 Lys Leu Asn Lys Ile Lys Lys Thr Gln Arg Glu Leu Ser Gln Gln Leu 195 200 205 Gly Arg Ser Ala Thr Pro Ala Glu Val Ala Lys Ala Leu Glu Ile Asp 210 215 220 Pro Ser Gln Ile Arg Glu Tyr Leu Ser Leu Ser Arg Gln Pro Ile Ser 225 230 235 240 Leu Asp Val Arg Val Gly Asp Asn Gln Asp Thr Glu Leu Ser Glu Leu 245 250 255 Leu Glu Asp Glu Gly Val Ser Pro Asp Ala Tyr Ile Thr Gln Glu Ser 260 265 270 Met Arg Gln Asp Leu Gln Asn Leu Leu Ala Glu Leu Thr Pro Gln Gln 275 280 285 Gln Ala Val Leu Thr Met Arg Phe Gly Leu Asn Asp Gly Gln Glu Leu 290 295 300 Ser Leu Ala Lys Ile Gly Gln His Leu Asn Ile Ser Arg Glu Arg Val 305 310 315 320 Arg Gln Leu Glu Asn Gln Ala Leu Ala Gln Leu Lys Arg Arg Arg Ala 325 330 335 Asn Met Ala Glu Tyr Ile Ile Ala Ser 340 345 35 1884 DNA Synechocystis sp. 
strain PCC6803 CDS (1)..(1884) 35 atg aaa ttt tcc tgg aga act gcc cta ctt tgg tcc cta ccc ctg ttg 48 Met Lys Phe Ser Trp Arg Thr Ala Leu Leu Trp Ser Leu Pro Leu Leu 1 5 10 15 gta gtc ggc ttt ttc ttc tgg cag ggg agc ttt gga ggg gca gat gcc 96 Val Val Gly Phe Phe Phe Trp Gln Gly Ser Phe Gly Gly Ala Asp Ala 20 25 30 aac ctc ggt tcc aac act gcc aac acc cgc atg acc tat ggt cgc ttc 144 Asn Leu Gly Ser Asn Thr Ala Asn Thr Arg Met Thr Tyr Gly Arg Phe 35 40 45 ctc gaa tat gtg gat gct ggc cgc atc acc agt gtg gat tta tat gaa 192 Leu Glu Tyr Val Asp Ala Gly Arg Ile Thr Ser Val Asp Leu Tyr Glu 50 55 60 aat ggc cgc acg gcg atc gtg caa gtt agc gac cca gaa gta gac cgg 240 Asn Gly Arg Thr Ala Ile Val Gln Val Ser Asp Pro Glu Val Asp Arg 65 70 75 80 acc ctc cgt tcc cgg gtt gac ctc ccc acc aat gcc ccg gaa ttg att 288 Thr Leu Arg Ser Arg Val Asp Leu Pro Thr Asn Ala Pro Glu Leu Ile 85 90 95 gcc cgt tta cgg gac tcc aac att cgc ctt gat tcc cac cct gtc cgc 336 Ala Arg Leu Arg Asp Ser Asn Ile Arg Leu Asp Ser His Pro Val Arg 100 105 110 aac aat ggc atg gtt tgg ggt ttt gtg ggc aac ttg att ttc ccc gtg 384 Asn Asn Gly Met Val Trp Gly Phe Val Gly Asn Leu Ile Phe Pro Val 115 120 125 ctt ttg att gct tcc ctc ttt ttt ctc ttc cgc cgt tcc agc aac atg 432 Leu Leu Ile Ala Ser Leu Phe Phe Leu Phe Arg Arg Ser Ser Asn Met 130 135 140 cct ggg ggc ccc ggc caa gcc atg aac ttt ggt aaa tcc aaa gct cgc 480 Pro Gly Gly Pro Gly Gln Ala Met Asn Phe Gly Lys Ser Lys Ala Arg 145 150 155 160 ttc caa atg gat gcc aaa acc ggt gtc atg ttc gat gat gtg gcc ggt 528 Phe Gln Met Asp Ala Lys Thr Gly Val Met Phe Asp Asp Val Ala Gly 165 170 175 att gac gaa gcc aag gaa gaa ttg caa gag gtg gta act ttc ctt aaa 576 Ile Asp Glu Ala Lys Glu Glu Leu Gln Glu Val Val Thr Phe Leu Lys 180 185 190 cag ccc gaa cgc ttt act gca gtg ggg gcc aag att ccc aaa gga gta 624 Gln Pro Glu Arg Phe Thr Ala Val Gly Ala Lys Ile Pro Lys Gly Val 195 200 205 ctc tta gtg ggc cct ccc ggt acc ggt aaa act ctc ctc gcc aag gcg 672 Leu Leu Val Gly Pro Pro Gly Thr Gly 
Lys Thr Leu Leu Ala Lys Ala 210 215 220 atc gcc ggg gaa gcc gga gtt cct ttc ttc agc att tct ggt tcc gag 720 Ile Ala Gly Glu Ala Gly Val Pro Phe Phe Ser Ile Ser Gly Ser Glu 225 230 235 240 ttc gta gaa atg ttt gtc ggc gtt ggt gcc tcc cgg gtg cgg gac ttg 768 Phe Val Glu Met Phe Val Gly Val Gly Ala Ser Arg Val Arg Asp Leu 245 250 255 ttt aaa aaa gcc aaa gag aat gcc ccc tgt ttg atc ttc att gat gag 816 Phe Lys Lys Ala Lys Glu Asn Ala Pro Cys Leu Ile Phe Ile Asp Glu 260 265 270 att gat gcc gtg ggt cgt caa cgg ggt gct ggt atc ggt ggt ggt aac 864 Ile Asp Ala Val Gly Arg Gln Arg Gly Ala Gly Ile Gly Gly Gly Asn 275 280 285 gat gaa cgg gaa caa acc ctc aac cag cta cta acc gag atg gac ggt 912 Asp Glu Arg Glu Gln Thr Leu Asn Gln Leu Leu Thr Glu Met Asp Gly 290 295 300 ttt gaa ggc aat acg ggc att att atc att gcc gcc act aac cgc cct 960 Phe Glu Gly Asn Thr Gly Ile Ile Ile Ile Ala Ala Thr Asn Arg Pro 305 310 315 320 gac gtg cta gat tct gcc ttg atg cgt ccc ggt cgt ttc gat cgc caa 1008 Asp Val Leu Asp Ser Ala Leu Met Arg Pro Gly Arg Phe Asp Arg Gln 325 330 335 gtg atg gtg gat gcc cct gac tac tct ggt cgt aag gaa att tta gaa 1056 Val Met Val Asp Ala Pro Asp Tyr Ser Gly Arg Lys Glu Ile Leu Glu 340 345 350 gtc cac gcc cgc aat aaa aag tta gca ccg gaa gtt tcc atc gac tcc 1104 Val His Ala Arg Asn Lys Lys Leu Ala Pro Glu Val Ser Ile Asp Ser 355 360 365 att gcc cgc cgt act ccc ggt ttt agt ggg gct gac ttg gcc aat tta 1152 Ile Ala Arg Arg Thr Pro Gly Phe Ser Gly Ala Asp Leu Ala Asn Leu 370 375 380 ttg aat gaa gcc gcc att ctc acc gcc cgc cgt cgt aaa tcc gct atc 1200 Leu Asn Glu Ala Ala Ile Leu Thr Ala Arg Arg Arg Lys Ser Ala Ile 385 390 395 400 act ctg ttg gaa att gat gat gcc gtg gac cgg gtg gta gct ggt atg 1248 Thr Leu Leu Glu Ile Asp Asp Ala Val Asp Arg Val Val Ala Gly Met 405 410 415 gaa ggc acc ccc ttg gtg gac agc aaa agt aag cgg cta att gct tat 1296 Glu Gly Thr Pro Leu Val Asp Ser Lys Ser Lys Arg Leu Ile Ala Tyr 420 425 430 cac gaa gta ggc cac gcc att gtg ggc aca ttg tta aaa gac cat gat 1344 
His Glu Val Gly His Ala Ile Val Gly Thr Leu Leu Lys Asp His Asp 435 440 445 ccc gtg caa aaa gtc acc ctt att cct cgg ggc caa gcc caa ggt ttg 1392 Pro Val Gln Lys Val Thr Leu Ile Pro Arg Gly Gln Ala Gln Gly Leu 450 455 460 acc tgg ttc act ccc aac gaa gaa cag ggt tta acc acc aaa gcc caa 1440 Thr Trp Phe Thr Pro Asn Glu Glu Gln Gly Leu Thr Thr Lys Ala Gln 465 470 475 480 ctg atg gcc cgt att gct gga gca atg ggc ggt cga gcc gct gaa gag 1488 Leu Met Ala Arg Ile Ala Gly Ala Met Gly Gly Arg Ala Ala Glu Glu 485 490 495 gaa gtt ttt ggc gat gac gaa gta acc act ggg gct ggt ggt gac cta 1536 Glu Val Phe Gly Asp Asp Glu Val Thr Thr Gly Ala Gly Gly Asp Leu 500 505 510 caa cag gta act gag atg gct cgc cag atg gta act cgt ttt ggc atg 1584 Gln Gln Val Thr Glu Met Ala Arg Gln Met Val Thr Arg Phe Gly Met 515 520 525 agc aac ctt ggt ccc att tcc ctg gag agt tca ggt ggg gaa gta ttc 1632 Ser Asn Leu Gly Pro Ile Ser Leu Glu Ser Ser Gly Gly Glu Val Phe 530 535 540 ctg ggt ggt ggc ttg atg aac cgt tct gaa tac tcc gaa gaa gta gcc 1680 Leu Gly Gly Gly Leu Met Asn Arg Ser Glu Tyr Ser Glu Glu Val Ala 545 550 555 560 acc cgc att gat gcc caa gta cgg caa ttg gct gaa cag ggt cac caa 1728 Thr Arg Ile Asp Ala Gln Val Arg Gln Leu Ala Glu Gln Gly His Gln 565 570 575 atg gct cgc aaa atc gtc caa gaa caa cgg gaa gtg gtt gat cgc ctg 1776 Met Ala Arg Lys Ile Val Gln Glu Gln Arg Glu Val Val Asp Arg Leu 580 585 590 gtg gat ctt tta att gag aaa gaa acc att gat ggg gaa gaa ttt cgg 1824 Val Asp Leu Leu Ile Glu Lys Glu Thr Ile Asp Gly Glu Glu Phe Arg 595 600 605 caa att gtg gcg gaa tac gcc gag gtt ccc gtc aag gaa cag tta att 1872 Gln Ile Val Ala Glu Tyr Ala Glu Val Pro Val Lys Glu Gln Leu Ile 610 615 620 ccc caa cta taa 1884 Pro Gln Leu 625 36 627 PRT Synechocystis sp. 
strain PCC6803 36 Met Lys Phe Ser Trp Arg Thr Ala Leu Leu Trp Ser Leu Pro Leu Leu 1 5 10 15 Val Val Gly Phe Phe Phe Trp Gln Gly Ser Phe Gly Gly Ala Asp Ala 20 25 30 Asn Leu Gly Ser Asn Thr Ala Asn Thr Arg Met Thr Tyr Gly Arg Phe 35 40 45 Leu Glu Tyr Val Asp Ala Gly Arg Ile Thr Ser Val Asp Leu Tyr Glu 50 55 60 Asn Gly Arg Thr Ala Ile Val Gln Val Ser Asp Pro Glu Val Asp Arg 65 70 75 80 Thr Leu Arg Ser Arg Val Asp Leu Pro Thr Asn Ala Pro Glu Leu Ile 85 90 95 Ala Arg Leu Arg Asp Ser Asn Ile Arg Leu Asp Ser His Pro Val Arg 100 105 110 Asn Asn Gly Met Val Trp Gly Phe Val Gly Asn Leu Ile Phe Pro Val 115 120 125 Leu Leu Ile Ala Ser Leu Phe Phe Leu Phe Arg Arg Ser Ser Asn Met 130 135 140 Pro Gly Gly Pro Gly Gln Ala Met Asn Phe Gly Lys Ser Lys Ala Arg 145 150 155 160 Phe Gln Met Asp Ala Lys Thr Gly Val Met Phe Asp Asp Val Ala Gly 165 170 175 Ile Asp Glu Ala Lys Glu Glu Leu Gln Glu Val Val Thr Phe Leu Lys 180 185 190 Gln Pro Glu Arg Phe Thr Ala Val Gly Ala Lys Ile Pro Lys Gly Val 195 200 205 Leu Leu Val Gly Pro Pro Gly Thr Gly Lys Thr Leu Leu Ala Lys Ala 210 215 220 Ile Ala Gly Glu Ala Gly Val Pro Phe Phe Ser Ile Ser Gly Ser Glu 225 230 235 240 Phe Val Glu Met Phe Val Gly Val Gly Ala Ser Arg Val Arg Asp Leu 245 250 255 Phe Lys Lys Ala Lys Glu Asn Ala Pro Cys Leu Ile Phe Ile Asp Glu 260 265 270 Ile Asp Ala Val Gly Arg Gln Arg Gly Ala Gly Ile Gly Gly Gly Asn 275 280 285 Asp Glu Arg Glu Gln Thr Leu Asn Gln Leu Leu Thr Glu Met Asp Gly 290 295 300 Phe Glu Gly Asn Thr Gly Ile Ile Ile Ile Ala Ala Thr Asn Arg Pro 305 310 315 320 Asp Val Leu Asp Ser Ala Leu Met Arg Pro Gly Arg Phe Asp Arg Gln 325 330 335 Val Met Val Asp Ala Pro Asp Tyr Ser Gly Arg Lys Glu Ile Leu Glu 340 345 350 Val His Ala Arg Asn Lys Lys Leu Ala Pro Glu Val Ser Ile Asp Ser 355 360 365 Ile Ala Arg Arg Thr Pro Gly Phe Ser Gly Ala Asp Leu Ala Asn Leu 370 375 380 Leu Asn Glu Ala Ala Ile Leu Thr Ala Arg Arg Arg Lys Ser Ala Ile 385 390 395 400 Thr Leu Leu Glu Ile Asp Asp Ala Val Asp Arg Val Val Ala Gly Met 405 410 415 Glu 
Gly Thr Pro Leu Val Asp Ser Lys Ser Lys Arg Leu Ile Ala Tyr 420 425 430 His Glu Val Gly His Ala Ile Val Gly Thr Leu Leu Lys Asp His Asp 435 440 445 Pro Val Gln Lys Val Thr Leu Ile Pro Arg Gly Gln Ala Gln Gly Leu 450 455 460 Thr Trp Phe Thr Pro Asn Glu Glu Gln Gly Leu Thr Thr Lys Ala Gln 465 470 475 480 Leu Met Ala Arg Ile Ala Gly Ala Met Gly Gly Arg Ala Ala Glu Glu 485 490 495 Glu Val Phe Gly Asp Asp Glu Val Thr Thr Gly Ala Gly Gly Asp Leu 500 505 510 Gln Gln Val Thr Glu Met Ala Arg Gln Met Val Thr Arg Phe Gly Met 515 520 525 Ser Asn Leu Gly Pro Ile Ser Leu Glu Ser Ser Gly Gly Glu Val Phe 530 535 540 Leu Gly Gly Gly Leu Met Asn Arg Ser Glu Tyr Ser Glu Glu Val Ala 545 550 555 560 Thr Arg Ile Asp Ala Gln Val Arg Gln Leu Ala Glu Gln Gly His Gln 565 570 575 Met Ala Arg Lys Ile Val Gln Glu Gln Arg Glu Val Val Asp Arg Leu 580 585 590 Val Asp Leu Leu Ile Glu Lys Glu Thr Ile Asp Gly Glu Glu Phe Arg 595 600 605 Gln Ile Val Ala Glu Tyr Ala Glu Val Pro Val Lys Glu Gln Leu Ile 610 615 620 Pro Gln Leu 625 37 2619 DNA Synechocystis sp. 
strain PCC6803 CDS (1)..(2619) 37 atg caa ccc aca gat cct aat aaa ttt acg gag aaa gct tgg gag gcg 48 Met Gln Pro Thr Asp Pro Asn Lys Phe Thr Glu Lys Ala Trp Glu Ala 1 5 10 15 atc gcc aaa aca ccg gag att gct aaa cag cat cga caa cag caa att 96 Ile Ala Lys Thr Pro Glu Ile Ala Lys Gln His Arg Gln Gln Gln Ile 20 25 30 gag acg gaa cac cta ctc agt gcc cta cta gaa caa aat ggt ctg gcc 144 Glu Thr Glu His Leu Leu Ser Ala Leu Leu Glu Gln Asn Gly Leu Ala 35 40 45 acc agc atc ttt aat aag gct ggg gcg agc att ccc cga gtt aac gat 192 Thr Ser Ile Phe Asn Lys Ala Gly Ala Ser Ile Pro Arg Val Asn Asp 50 55 60 caa gtt aat agc ttt att gcc caa cag cca aaa tta agt aat ccg agt 240 Gln Val Asn Ser Phe Ile Ala Gln Gln Pro Lys Leu Ser Asn Pro Ser 65 70 75 80 gaa tcg att tat tta ggc cgc agt ctc gat aaa ttg ttg gac aat gcg 288 Glu Ser Ile Tyr Leu Gly Arg Ser Leu Asp Lys Leu Leu Asp Asn Ala 85 90 95 gaa ata gcc aag tct aaa tat gga gac gac tat att tcc atc gag cac 336 Glu Ile Ala Lys Ser Lys Tyr Gly Asp Asp Tyr Ile Ser Ile Glu His 100 105 110 ttg atg gcg gct tac ggc caa gat gac cgc ctg ggc aaa aac tta tat 384 Leu Met Ala Ala Tyr Gly Gln Asp Asp Arg Leu Gly Lys Asn Leu Tyr 115 120 125 cga gaa att ggc cta aca gaa aat aag ttg gca gaa att atc aag caa 432 Arg Glu Ile Gly Leu Thr Glu Asn Lys Leu Ala Glu Ile Ile Lys Gln 130 135 140 att aga gga acc caa aaa gtg acc gat caa aat cca gag ggc aaa tac 480 Ile Arg Gly Thr Gln Lys Val Thr Asp Gln Asn Pro Glu Gly Lys Tyr 145 150 155 160 gaa tcc ctt gaa aaa tat ggg cga gat tta acg gaa tta gcc cgg gaa 528 Glu Ser Leu Glu Lys Tyr Gly Arg Asp Leu Thr Glu Leu Ala Arg Glu 165 170 175 ggt aaa cta gat cct gtc att ggc cgg gat gaa gaa gtg cgg cgc acc 576 Gly Lys Leu Asp Pro Val Ile Gly Arg Asp Glu Glu Val Arg Arg Thr 180 185 190 att cag atc ctt tcc cgc cgc aca aaa aat aac cct gtg tta att ggg 624 Ile Gln Ile Leu Ser Arg Arg Thr Lys Asn Asn Pro Val Leu Ile Gly 195 200 205 gaa ccg ggg gtt ggt aaa acg gcg atc gcc gaa ggt tta gcc caa aga 672 Glu Pro Gly Val Gly Lys Thr Ala Ile 
Ala Glu Gly Leu Ala Gln Arg 210 215 220 att att aac cat gac gta ccg gaa tca ttg cgg gat cgc aaa cta att 720 Ile Ile Asn His Asp Val Pro Glu Ser Leu Arg Asp Arg Lys Leu Ile 225 230 235 240 tcc ctc gat atg ggg gcg tta att gcc ggg gca aaa tac cgg ggg gaa 768 Ser Leu Asp Met Gly Ala Leu Ile Ala Gly Ala Lys Tyr Arg Gly Glu 245 250 255 ttt gaa gaa aga ctt aaa gcg gta ctt aaa gaa gtt acc gac agc cag 816 Phe Glu Glu Arg Leu Lys Ala Val Leu Lys Glu Val Thr Asp Ser Gln 260 265 270 ggg caa att att ctc ttt att gac gaa att cat acc gtt gtc ggc gct 864 Gly Gln Ile Ile Leu Phe Ile Asp Glu Ile His Thr Val Val Gly Ala 275 280 285 ggg gcc acc caa gga gcc atg gat gcg ggc aac tta ttg aaa ccc atg 912 Gly Ala Thr Gln Gly Ala Met Asp Ala Gly Asn Leu Leu Lys Pro Met 290 295 300 tta gcc cgg ggt gct ttg cgt tgt atc ggg gcc acc act tta gat gaa 960 Leu Ala Arg Gly Ala Leu Arg Cys Ile Gly Ala Thr Thr Leu Asp Glu 305 310 315 320 tat cgc aaa tat atc gaa aaa gat gcg gct ttg gaa cga cgt ttc cag 1008 Tyr Arg Lys Tyr Ile Glu Lys Asp Ala Ala Leu Glu Arg Arg Phe Gln 325 330 335 gaa gtt tta gtg gat gaa ccc aat gta tta gat acc att tcc att ctc 1056 Glu Val Leu Val Asp Glu Pro Asn Val Leu Asp Thr Ile Ser Ile Leu 340 345 350 cgg gga tta aaa gaa cgc tat gaa gta cac cac ggc gta aaa att gcc 1104 Arg Gly Leu Lys Glu Arg Tyr Glu Val His His Gly Val Lys Ile Ala 355 360 365 gat agt gcc ctg gta gcg gcg gcc atg ttg tcc aat cgt tac atc agt 1152 Asp Ser Ala Leu Val Ala Ala Ala Met Leu Ser Asn Arg Tyr Ile Ser 370 375 380 gat cgt ttt ctg ccg gat aaa gct att gat tta gta gac gaa gca gcg 1200 Asp Arg Phe Leu Pro Asp Lys Ala Ile Asp Leu Val Asp Glu Ala Ala 385 390 395 400 gcc aaa tta aaa atg gaa atc acc tcc aaa cca gag gaa tta gat gaa 1248 Ala Lys Leu Lys Met Glu Ile Thr Ser Lys Pro Glu Glu Leu Asp Glu 405 410 415 gtt gac cgg aaa att ctc caa cta gaa atg gag cgt tta tct tta caa 1296 Val Asp Arg Lys Ile Leu Gln Leu Glu Met Glu Arg Leu Ser Leu Gln 420 425 430 cgg gaa aat gat tct gct tcc aag gag cgg cta gaa aaa ttg gag aaa 1344 
Arg Glu Asn Asp Ser Ala Ser Lys Glu Arg Leu Glu Lys Leu Glu Lys 435 440 445 gag ttg gct gat ttt aaa gaa gaa cag tct aaa ctt aat ggc caa tgg 1392 Glu Leu Ala Asp Phe Lys Glu Glu Gln Ser Lys Leu Asn Gly Gln Trp 450 455 460 cag tcg gaa aaa acg gtt att gat caa att cgt act gtt aag gaa acc 1440 Gln Ser Glu Lys Thr Val Ile Asp Gln Ile Arg Thr Val Lys Glu Thr 465 470 475 480 atc gac cag gtg aac cta gaa att caa cag gcc caa cgg gat tac gac 1488 Ile Asp Gln Val Asn Leu Glu Ile Gln Gln Ala Gln Arg Asp Tyr Asp 485 490 495 tac aat aaa gca gcg gag tta cag tat ggc aaa tta act gat tta cag 1536 Tyr Asn Lys Ala Ala Glu Leu Gln Tyr Gly Lys Leu Thr Asp Leu Gln 500 505 510 cgg caa gtg gaa gct ttg gaa acc caa ttg gcg gag caa caa acc tct 1584 Arg Gln Val Glu Ala Leu Glu Thr Gln Leu Ala Glu Gln Gln Thr Ser 515 520 525 ggc aaa tcc ctc tta cgg gaa gaa gtt tta gag tct gac att gct gaa 1632 Gly Lys Ser Leu Leu Arg Glu Glu Val Leu Glu Ser Asp Ile Ala Glu 530 535 540 att atc tcg aaa tgg acc ggc att ccc atc agt aaa ttg gtg gaa tcg 1680 Ile Ile Ser Lys Trp Thr Gly Ile Pro Ile Ser Lys Leu Val Glu Ser 545 550 555 560 gaa aaa gaa aaa ctg ctc cac ttg gaa gat gaa cta cac agc cga gtg 1728 Glu Lys Glu Lys Leu Leu His Leu Glu Asp Glu Leu His Ser Arg Val 565 570 575 att ggt cag gat gaa gcg gta acc gcc gta gcc gaa gcc att caa cgc 1776 Ile Gly Gln Asp Glu Ala Val Thr Ala Val Ala Glu Ala Ile Gln Arg 580 585 590 tcc cga gct ggt ctt tcc gat cct aat cgt ccc acc gct agc ttt att 1824 Ser Arg Ala Gly Leu Ser Asp Pro Asn Arg Pro Thr Ala Ser Phe Ile 595 600 605 ttt ctg ggc ccc aca ggg gtc ggg aaa act gag tta gcg aag gct ttg 1872 Phe Leu Gly Pro Thr Gly Val Gly Lys Thr Glu Leu Ala Lys Ala Leu 610 615 620 gcg aaa aat tta ttc gac acg gaa gaa gcc ctg gtg cgg att gat atg 1920 Ala Lys Asn Leu Phe Asp Thr Glu Glu Ala Leu Val Arg Ile Asp Met 625 630 635 640 tct gaa tat atg gaa aaa cac gct gtt tcc cgt tta atg ggg gcc cct 1968 Ser Glu Tyr Met Glu Lys His Ala Val Ser Arg Leu Met Gly Ala Pro 645 650 655 ccg ggc tat gtg ggc tat 
gaa gaa ggg gga caa ttg acg gaa gca att 2016 Pro Gly Tyr Val Gly Tyr Glu Glu Gly Gly Gln Leu Thr Glu Ala Ile 660 665 670 cgc cgc cgg ccc tat tcg gtc att ctt ttt gac gag att gaa aaa gcc 2064 Arg Arg Arg Pro Tyr Ser Val Ile Leu Phe Asp Glu Ile Glu Lys Ala 675 680 685 cat ggg gat gtg ttt aac gtc atg ctc caa atc ctg gat gat ggc cgt 2112 His Gly Asp Val Phe Asn Val Met Leu Gln Ile Leu Asp Asp Gly Arg 690 695 700 tta acc gat gcc caa ggc cat gtg gtg gac ttc aaa aat acg att atc 2160 Leu Thr Asp Ala Gln Gly His Val Val Asp Phe Lys Asn Thr Ile Ile 705 710 715 720 att atg acc agt aac ctg ggc tcc caa tac att ttg gat gtg gcg ggg 2208 Ile Met Thr Ser Asn Leu Gly Ser Gln Tyr Ile Leu Asp Val Ala Gly 725 730 735 gat gat agt cgt tat gaa gaa atg cgg agc cga gtt atg gat gta atg 2256 Asp Asp Ser Arg Tyr Glu Glu Met Arg Ser Arg Val Met Asp Val Met 740 745 750 cgg gaa aac ttc cgc cca gaa ttt ctc aat cgg gtg gat gaa acg att 2304 Arg Glu Asn Phe Arg Pro Glu Phe Leu Asn Arg Val Asp Glu Thr Ile 755 760 765 att ttc cat ggc tta caa aaa tcc gag tta cga tcc att gtc caa att 2352 Ile Phe His Gly Leu Gln Lys Ser Glu Leu Arg Ser Ile Val Gln Ile 770 775 780 caa att cag tct ttg gct acc cgt ttg gag gaa caa aaa tta act ttg 2400 Gln Ile Gln Ser Leu Ala Thr Arg Leu Glu Glu Gln Lys Leu Thr Leu 785 790 795 800 aag tta acg gat aaa gcc cta gat ttt ctg gct gcc gtg ggc tat gac 2448 Lys Leu Thr Asp Lys Ala Leu Asp Phe Leu Ala Ala Val Gly Tyr Asp 805 810 815 ccc gtt tat ggg gcc cga cct tta aaa cga gcc gtc caa aaa tac cta 2496 Pro Val Tyr Gly Ala Arg Pro Leu Lys Arg Ala Val Gln Lys Tyr Leu 820 825 830 gaa acg gcg atc gcc aag gga att tta cgg ggg gat tac aaa cct ggt 2544 Glu Thr Ala Ile Ala Lys Gly Ile Leu Arg Gly Asp Tyr Lys Pro Gly 835 840 845 gag acc att gtg gtg gat gaa acc gac gaa cgc ctc agt ttt acc agt 2592 Glu Thr Ile Val Val Asp Glu Thr Asp Glu Arg Leu Ser Phe Thr Ser 850 855 860 tta agg ggg gat tta gtc atc gtt tag 2619 Leu Arg Gly Asp Leu Val Ile Val 865 870 38 872 PRT Synechocystis sp. 
strain PCC6803 38 Met Gln Pro Thr Asp Pro Asn Lys Phe Thr Glu Lys Ala Trp Glu Ala 1 5 10 15 Ile Ala Lys Thr Pro Glu Ile Ala Lys Gln His Arg Gln Gln Gln Ile 20 25 30 Glu Thr Glu His Leu Leu Ser Ala Leu Leu Glu Gln Asn Gly Leu Ala 35 40 45 Thr Ser Ile Phe Asn Lys Ala Gly Ala Ser Ile Pro Arg Val Asn Asp 50 55 60 Gln Val Asn Ser Phe Ile Ala Gln Gln Pro Lys Leu Ser Asn Pro Ser 65 70 75 80 Glu Ser Ile Tyr Leu Gly Arg Ser Leu Asp Lys Leu Leu Asp Asn Ala 85 90 95 Glu Ile Ala Lys Ser Lys Tyr Gly Asp Asp Tyr Ile Ser Ile Glu His 100 105 110 Leu Met Ala Ala Tyr Gly Gln Asp Asp Arg Leu Gly Lys Asn Leu Tyr 115 120 125 Arg Glu Ile Gly Leu Thr Glu Asn Lys Leu Ala Glu Ile Ile Lys Gln 130 135 140 Ile Arg Gly Thr Gln Lys Val Thr Asp Gln Asn Pro Glu Gly Lys Tyr 145 150 155 160 Glu Ser Leu Glu Lys Tyr Gly Arg Asp Leu Thr Glu Leu Ala Arg Glu 165 170 175 Gly Lys Leu Asp Pro Val Ile Gly Arg Asp Glu Glu Val Arg Arg Thr 180 185 190 Ile Gln Ile Leu Ser Arg Arg Thr Lys Asn Asn Pro Val Leu Ile Gly 195 200 205 Glu Pro Gly Val Gly Lys Thr Ala Ile Ala Glu Gly Leu Ala Gln Arg 210 215 220 Ile Ile Asn His Asp Val Pro Glu Ser Leu Arg Asp Arg Lys Leu Ile 225 230 235 240 Ser Leu Asp Met Gly Ala Leu Ile Ala Gly Ala Lys Tyr Arg Gly Glu 245 250 255 Phe Glu Glu Arg Leu Lys Ala Val Leu Lys Glu Val Thr Asp Ser Gln 260 265 270 Gly Gln Ile Ile Leu Phe Ile Asp Glu Ile His Thr Val Val Gly Ala 275 280 285 Gly Ala Thr Gln Gly Ala Met Asp Ala Gly Asn Leu Leu Lys Pro Met 290 295 300 Leu Ala Arg Gly Ala Leu Arg Cys Ile Gly Ala Thr Thr Leu Asp Glu 305 310 315 320 Tyr Arg Lys Tyr Ile Glu Lys Asp Ala Ala Leu Glu Arg Arg Phe Gln 325 330 335 Glu Val Leu Val Asp Glu Pro Asn Val Leu Asp Thr Ile Ser Ile Leu 340 345 350 Arg Gly Leu Lys Glu Arg Tyr Glu Val His His Gly Val Lys Ile Ala 355 360 365 Asp Ser Ala Leu Val Ala Ala Ala Met Leu Ser Asn Arg Tyr Ile Ser 370 375 380 Asp Arg Phe Leu Pro Asp Lys Ala Ile Asp Leu Val Asp Glu Ala Ala 385 390 395 400 Ala Lys Leu Lys Met Glu Ile Thr Ser Lys Pro Glu Glu Leu Asp Glu 405 410 415 Val 
Asp Arg Lys Ile Leu Gln Leu Glu Met Glu Arg Leu Ser Leu Gln 420 425 430 Arg Glu Asn Asp Ser Ala Ser Lys Glu Arg Leu Glu Lys Leu Glu Lys 435 440 445 Glu Leu Ala Asp Phe Lys Glu Glu Gln Ser Lys Leu Asn Gly Gln Trp 450 455 460 Gln Ser Glu Lys Thr Val Ile Asp Gln Ile Arg Thr Val Lys Glu Thr 465 470 475 480 Ile Asp Gln Val Asn Leu Glu Ile Gln Gln Ala Gln Arg Asp Tyr Asp 485 490 495 Tyr Asn Lys Ala Ala Glu Leu Gln Tyr Gly Lys Leu Thr Asp Leu Gln 500 505 510 Arg Gln Val Glu Ala Leu Glu Thr Gln Leu Ala Glu Gln Gln Thr Ser 515 520 525 Gly Lys Ser Leu Leu Arg Glu Glu Val Leu Glu Ser Asp Ile Ala Glu 530 535 540 Ile Ile Ser Lys Trp Thr Gly Ile Pro Ile Ser Lys Leu Val Glu Ser 545 550 555 560 Glu Lys Glu Lys Leu Leu His Leu Glu Asp Glu Leu His Ser Arg Val 565 570 575 Ile Gly Gln Asp Glu Ala Val Thr Ala Val Ala Glu Ala Ile Gln Arg 580 585 590 Ser Arg Ala Gly Leu Ser Asp Pro Asn Arg Pro Thr Ala Ser Phe Ile 595 600 605 Phe Leu Gly Pro Thr Gly Val Gly Lys Thr Glu Leu Ala Lys Ala Leu 610 615 620 Ala Lys Asn Leu Phe Asp Thr Glu Glu Ala Leu Val Arg Ile Asp Met 625 630 635 640 Ser Glu Tyr Met Glu Lys His Ala Val Ser Arg Leu Met Gly Ala Pro 645 650 655 Pro Gly Tyr Val Gly Tyr Glu Glu Gly Gly Gln Leu Thr Glu Ala Ile 660 665 670 Arg Arg Arg Pro Tyr Ser Val Ile Leu Phe Asp Glu Ile Glu Lys Ala 675 680 685 His Gly Asp Val Phe Asn Val Met Leu Gln Ile Leu Asp Asp Gly Arg 690 695 700 Leu Thr Asp Ala Gln Gly His Val Val Asp Phe Lys Asn Thr Ile Ile 705 710 715 720 Ile Met Thr Ser Asn Leu Gly Ser Gln Tyr Ile Leu Asp Val Ala Gly 725 730 735 Asp Asp Ser Arg Tyr Glu Glu Met Arg Ser Arg Val Met Asp Val Met 740 745 750 Arg Glu Asn Phe Arg Pro Glu Phe Leu Asn Arg Val Asp Glu Thr Ile 755 760 765 Ile Phe His Gly Leu Gln Lys Ser Glu Leu Arg Ser Ile Val Gln Ile 770 775 780 Gln Ile Gln Ser Leu Ala Thr Arg Leu Glu Glu Gln Lys Leu Thr Leu 785 790 795 800 Lys Leu Thr Asp Lys Ala Leu Asp Phe Leu Ala Ala Val Gly Tyr Asp 805 810 815 Pro Val Tyr Gly Ala Arg Pro Leu Lys Arg Ala Val Gln Lys Tyr Leu 820 825 830 Glu Thr 
Ala Ile Ala Lys Gly Ile Leu Arg Gly Asp Tyr Lys Pro Gly 835 840 845 Glu Thr Ile Val Val Asp Glu Thr Asp Glu Arg Leu Ser Phe Thr Ser 850 855 860 Leu Arg Gly Asp Leu Val Ile Val 865 870 39 198 DNA Synechocystis sp. strain PCC6803 CDS (1)..(198) 39 atg ttc gcc ccc atc gtt atc ttg gtt cgt caa cag tta ggc aaa gct 48 Met Phe Ala Pro Ile Val Ile Leu Val Arg Gln Gln Leu Gly Lys Ala 1 5 10 15 aag ttc aat cag atc cgc ggt aag gcg att gcc ctc cac tgc cag acc 96 Lys Phe Asn Gln Ile Arg Gly Lys Ala Ile Ala Leu His Cys Gln Thr 20 25 30 atc acc aac ttt tgt aac cgg gtg ggc atc gat gcc aaa cag cgc caa 144 Ile Thr Asn Phe Cys Asn Arg Val Gly Ile Asp Ala Lys Gln Arg Gln 35 40 45 aat tta atc cgt tta gct aag tcc aac ggc aaa acc ctc ggt tta ttg 192 Asn Leu Ile Arg Leu Ala Lys Ser Asn Gly Lys Thr Leu Gly Leu Leu 50 55 60 gcc taa 198 Ala 65 40 65 PRT Synechocystis sp. strain PCC6803 40 Met Phe Ala Pro Ile Val Ile Leu Val Arg Gln Gln Leu Gly Lys Ala 1 5 10 15 Lys Phe Asn Gln Ile Arg Gly Lys Ala Ile Ala Leu His Cys Gln Thr 20 25 30 Ile Thr Asn Phe Cys Asn Arg Val Gly Ile Asp Ala Lys Gln Arg Gln 35 40 45 Asn Leu Ile Arg Leu Ala Lys Ser Asn Gly Lys Thr Leu Gly Leu Leu 50 55 60 Ala 65 
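The listing above pairs each coding sequence with the polypeptide it encodes (for example, the 198-nucleotide CDS of SEQ ID NO:39 with the 65-residue polypeptide of SEQ ID NO:40). The following is a minimal sketch, not part of the original listing, of how that pairing could be checked by translating the CDS codon by codon. It assumes the standard genetic code; the names `CODON_TABLE`, `THREE_LETTER`, and `translate` are illustrative only.

```python
from itertools import product

# Standard genetic code (assumed), with codons ordered t, c, a, g at each position.
BASES = "tcag"
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# One-letter to three-letter residue codes, matching the listing's notation.
THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
}

# SEQ ID NO:39 as given in the listing (198 nucleotides, CDS (1)..(198)).
SEQ_ID_39 = """
atg ttc gcc ccc atc gtt atc ttg gtt cgt caa cag tta ggc aaa gct
aag ttc aat cag atc cgc ggt aag gcg att gcc ctc cac tgc cag acc
atc acc aac ttt tgt aac cgg gtg ggc atc gat gcc aaa cag cgc caa
aat tta atc cgt tta gct aag tcc aac ggc aaa acc ctc ggt tta ttg
gcc taa
"""

# SEQ ID NO:40 as given in the listing (65 residues).
SEQ_ID_40 = """
Met Phe Ala Pro Ile Val Ile Leu Val Arg Gln Gln Leu Gly Lys Ala
Lys Phe Asn Gln Ile Arg Gly Lys Ala Ile Ala Leu His Cys Gln Thr
Ile Thr Asn Phe Cys Asn Arg Val Gly Ile Asp Ala Lys Gln Arg Gln
Asn Leu Ile Arg Leu Ala Lys Ser Asn Gly Lys Thr Leu Gly Leu Leu
Ala
"""


def translate(cds):
    """Translate a CDS into three-letter residues, stopping at the first stop codon."""
    cds = "".join(cds.split()).lower()
    if len(cds) % 3 != 0:
        raise ValueError("CDS length is not a multiple of three")
    residues = []
    for i in range(0, len(cds), 3):
        aa = CODON_TABLE[cds[i:i + 3]]
        if aa == "*":  # stop codon ends the polypeptide
            break
        residues.append(THREE_LETTER[aa])
    return residues


protein = translate(SEQ_ID_39)
print(len(protein), protein == SEQ_ID_40.split())  # prints: 65 True
```

The same check can be applied to any other DNA/PRT pair in the listing by substituting the corresponding sequences.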

What is claimed is:
 1. A method for regulating expression of a coding region of interest in a cyanobacterium comprising: a) providing a transformed cyanobacterium having a gene fusion comprising: i) a promoter region from a gene selected from the group consisting of: 1) an amiC gene or an rbcX gene; and 2) a gene having a nucleotide sequence as set forth in SEQ ID NO: 5; and ii) a coding region of interest; wherein the promoter region is operably linked to the coding region of interest; and b) culturing the transformed cyanobacterium of step (a) in the log phase, whereby the promoter region is activated and the coding region of interest is expressed.
 2. A method according to claim 1, wherein the promoter region is from a gene encoding a polypeptide having the amino acid sequence selected from the group consisting of SEQ ID NO:2, SEQ ID NO:4 and SEQ ID NO:6.
 3. A method for regulating expression of a coding region of interest in a cyanobacterium comprising: a) providing a transformed cyanobacterium having a gene fusion comprising: i) a promoter region from a gene selected from the group consisting of: 1) an hliB gene, an hsp17 gene, an nblB gene, an rpoD gene, an hliA gene, an ftsH gene and a clpB gene; and 2) a gene having a nucleotide sequence selected from the group consisting of SEQ ID NOs:9, 11, 17, 21, 25, 27, 31, and 39; and ii) a coding region of interest; wherein the promoter region is operably linked to the coding region of interest; and b) culturing the transformed cyanobacterium of step (a) in the presence of UV-B light, whereby the promoter region is activated and the coding region of interest is expressed.
 4. A method according to claim 3, wherein the promoter region is from a gene encoding a polypeptide having the amino acid sequence selected from the group consisting of SEQ ID NOs:8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, and 40.
 5. A method according to claim 3, wherein the UV-B light has a wavelength of from about 290 nm to about 330 nm.
 6. A method according to claim 3, wherein the UV-B light has an intensity of from about 20 μE s⁻¹ m⁻² to about 80 μE s⁻¹ m⁻².
 7. A method according to either of claims 1 or 3, wherein the cyanobacterium is selected from the group consisting of Asterocapsa, Aphanizomenon, Microcystis, Cylindrospermum, Anacystis, psychrophilic Anabaena, Nostoc, Tychonema, Planktothrix, Lyngbya, Schizothrix, Nodularia, Synechocystis and Synechococcus.
 8. A method according to claim 7, wherein the cyanobacterium is selected from the group consisting of Synechocystis and Synechococcus.
 9. A method according to either of claims 1 or 3, wherein the promoter region is derived from a cyanobacterium.
 10. A method according to claim 9, wherein the promoter region is derived from the group consisting of Asterocapsa, Aphanizomenon, Microcystis, Cylindrospermum, Anacystis, psychrophilic Anabaena, Nostoc, Tychonema, Planktothrix, Lyngbya, Schizothrix, Nodularia, Synechocystis and Synechococcus.
 11. A method according to claim 10, wherein the promoter region is derived from the group consisting of Synechocystis and Synechococcus.
 12. A method according to either of claims 1 or 3, wherein the coding region of interest is endogenous to the cyanobacterium.
 13. A method according to either of claims 1 or 3, wherein the coding region of interest is heterologous to the cyanobacterium.
 14. The method according to either of claims 1 or 3, wherein the coding region of interest is selected from the group consisting of crtE, crtB, pds, crtD, crtL, crtZ, crtX, crtO, phaC, phaE, efe, pdc, adh, genes encoding limonene synthase, pinene synthase, bornyl synthase, phellandrene synthase, cineole synthase, sabinene synthase, and taxadiene synthase.
 15. The method according to either of claims 1 or 3, wherein the gene fusion resides on a plasmid in the transformed cyanobacterium.
 16. The method according to either of claims 1 or 3, wherein the gene fusion is chromosomally integrated in the cyanobacterium genome. 